Your DID model assumes that some cities received the “treatment” (higher tax rate) while others did not. If the treatment is applied uniformly across all cities within a province, how did you define city-level variation within the same province? For instance, were all prefecture-level cities in Jiangsu province assigned to the treatment group? If so, the variation is not between individual cities but between provinces with different rate decisions. This would require a different methodological approach (e.g., a staggered adoption DID at the provincial level or a stronger justification for why city-level effects would differ within a province under a uniform tax rate).
If the tax rate is set provincially, cities in your “control group” provinces are not subject to a different “policy”; they are subject to a different provincial tax rate. This is a crucial distinction. The control group is not defined by the absence of the EPT Law (it applies nationwide) but by a different intensity of the same law. This is a valid approach only if the assignment of provinces to higher or lower rates is as good as random and not correlated with other factors affecting carbon emissions (e.g., existing provincial-level industrial policies, economic development trajectories, or environmental governance capacity). The parallel trends test at the city level may not adequately capture these pre-existing provincial-level differences.
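To illustrate the kind of province-level pre-trend check meant here, a minimal sketch on simulated data follows. It is not based on your data: the variable names (`high_rate`, `emissions`), the number of provinces, and the built-in trend gap are all hypothetical. The idea is to collapse the pre-policy sample to province-year means, estimate one pre-period slope per province, and compare slopes between higher-rate and lower-rate provinces.

```python
import numpy as np

# Simulated pre-policy province-year mean emissions; every number is hypothetical.
rng = np.random.default_rng(1)
n_prov, n_pre = 20, 5
years = np.arange(n_pre)

# Hypothetical indicator: the first 10 provinces later chose the higher rate.
high_rate = np.arange(n_prov) < 10

# By construction, higher-rate provinces are given a steeper pre-period trend,
# which is exactly the situation a province-level check should detect.
trend = np.where(high_rate, 0.30, 0.10)
emissions = 5.0 + trend[:, None] * years + rng.normal(0.0, 0.1, (n_prov, n_pre))

# One linear pre-period slope per province (polyfit fits each column separately).
slopes = np.polyfit(years, emissions.T, 1)[0]

def welch_t(a, b):
    """Welch t-statistic for a difference in means with unequal variances."""
    var_a = a.var(ddof=1) / a.size
    var_b = b.var(ddof=1) / b.size
    return (a.mean() - b.mean()) / np.sqrt(var_a + var_b)

t_stat = welch_t(slopes[high_rate], slopes[~high_rate])
print(f"difference in pre-period slopes, Welch t = {t_stat:.2f}")
```

A large t-statistic in a check of this kind would indicate that provinces choosing higher rates were already on a different emissions path before the policy, which a city-level parallel trends plot can mask.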
In its current form, the core explanatory variable, did, may not accurately capture a city-specific “treatment” of a raised tax rate. Instead, it likely proxies for a provincial-level policy intensity shock. This misspecification could bias your estimates. It could also partially explain the positive spatial spillover effect you find (aggravating emissions in neighboring cities), since neighboring cities often belong to the same province or to economically integrated provinces and therefore share the same rate decision.
Moreover, please address the following questions:
Was the treatment assignment (group_i) based on the city’s own decision or strictly on the province’s decision? If the former, on what legal or administrative basis?
If it is provincial, have you considered conducting the main analysis at the provincial level or clustering standard errors at the province level to account for the policy’s aggregate nature?
Have you tested for pre-existing trends or systematic differences between provinces that chose higher rates versus those that did not?
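To make the clustering point concrete, here is a minimal sketch on simulated data, assuming a balanced city panel in which the treatment indicator varies only at the province level and the true treatment effect is zero. All names and parameter values are hypothetical and the estimator is a plain two-way fixed effects regression written out by hand, not your specification. It compares standard errors clustered by city with standard errors clustered by province.

```python
import numpy as np

# Simulated balanced city panel; all names and numbers are hypothetical.
rng = np.random.default_rng(0)
n_prov, cities_per_prov, n_years, policy_year = 20, 10, 8, 4

prov_of_city = np.repeat(np.arange(n_prov), cities_per_prov)
n_city = prov_of_city.size
city = np.repeat(np.arange(n_city), n_years)  # city index of each observation
year = np.tile(np.arange(n_years), n_city)    # year index of each observation
prov = prov_of_city[city]                     # province index of each observation

# Treatment is assigned by province: every city in a higher-rate province is treated.
high_rate = np.arange(n_prov) < n_prov // 2
did = (high_rate[prov] & (year >= policy_year)).astype(float)

# Outcome with a true effect of zero and a province-year shock shared by all
# cities in the same province (the source of within-province correlation).
prov_year_shock = rng.normal(0.0, 1.0, (n_prov, n_years))
y = (rng.normal(0.0, 1.0, n_city)[city] + 0.1 * year
     + prov_year_shock[prov, year] + rng.normal(0.0, 0.5, city.size))

def demean_two_way(v):
    """Remove city and year fixed effects (valid for a balanced panel)."""
    city_mean = np.bincount(city, weights=v) / np.bincount(city)
    year_mean = np.bincount(year, weights=v) / np.bincount(year)
    return v - city_mean[city] - year_mean[year] + v.mean()

def cluster_se(x, resid, cluster):
    """Cluster-robust standard error for a single-regressor OLS slope."""
    scores = np.bincount(cluster, weights=x * resid)  # summed score per cluster
    g = np.unique(cluster).size
    return np.sqrt(g / (g - 1) * np.sum(scores ** 2)) / np.sum(x ** 2)

x_t, y_t = demean_two_way(did), demean_two_way(y)
beta = np.sum(x_t * y_t) / np.sum(x_t ** 2)
resid = y_t - beta * x_t
se_city = cluster_se(x_t, resid, city)
se_prov = cluster_se(x_t, resid, prov)
print(f"beta={beta:.3f}  SE(city)={se_city:.3f}  SE(province)={se_prov:.3f}")
```

In a setup like this, city-level clustering treats cities in the same province as independent draws and tends to understate uncertainty, whereas province-level clustering reflects the level at which the rate was actually set.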