
My data is currently organized in Stata as follows:

input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6  8 5
"B" 11  .  . 5 6 10 5
"C" 12 13  . 5 6  8 5
end
gen net_imports = imports_2017 - exports_2017
gen net_imports_toGDP = net_imports / gdp_2017

The code works, but it only produces a value when a country has 2017 GDP data. I would like to compute the import-to-GDP ratio using the most recent GDP observation available for each country.

Nick Cox
maldini425

1 Answer


You could simply replace the missing data as follows:

replace gdp_2016 = gdp_2015 if mi(gdp_2016)
replace gdp_2017 = gdp_2016 if mi(gdp_2017)
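
Once the gaps are filled this way, gdp_2017 holds the most recent GDP available for every country, so the ratio from the question can be computed against it. A minimal sketch, assuming the seventh data column in the question is exports_2017 (that name is an assumption; the question's input line does not list it):

```stata
clear
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6  8 5
"B" 11  .  . 5 6 10 5
"C" 12 13  . 5 6  8 5
end

* Carry the latest non-missing GDP forward, oldest year first,
* so a value filled into 2016 can also reach 2017
replace gdp_2016 = gdp_2015 if mi(gdp_2016)
replace gdp_2017 = gdp_2016 if mi(gdp_2017)

* exports_2017 is an assumed variable name (see the note above)
gen net_imports = imports_2017 - exports_2017
gen net_imports_toGDP = net_imports / gdp_2017

list Country gdp_2017 net_imports net_imports_toGDP
```

The order of the two replace statements matters: filling 2016 before 2017 is what lets country B, which only has 2015 GDP, end up with a usable denominator.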

However, a more general approach would begin by reshaping your data from wide to long:

reshape long gdp_ imports_ exports_, i(Country) 

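See help reshape for more detail on the command. Here gdp_, imports_ and exports_ are the stubs that become the new variable names, and i(Country) sets the identifier. Because no j() option is given, the year suffix is stored in a new variable named _j.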

Then you can fill forward within each country using time-series operators:

encode Country, generate(Country_num)
xtset Country_num _j
replace gdp_=l.gdp_ if mi(gdp_) & !mi(l.gdp_)
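
To connect the long-format approach back to the ratio asked for in the question, here is a rough end-to-end sketch. It makes three choices that are not in the answer above: it names the year variable with j(year) instead of the default _j, it assumes the seventh data column is exports_2017, and it fills forward with explicit subscripts under bysort rather than the lag operator. Subscripted replace works through the rows in order, so a value filled into one year is available to the next missing year.

```stata
clear
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6  8 5
"B" 11  .  . 5 6 10 5
"C" 12 13  . 5 6  8 5
end

* Wide to long; j(year) is a naming choice for this sketch
reshape long gdp_ imports_ exports_, i(Country) j(year)

* Within each country, sorted by year, copy the previous GDP
* into any missing year
bysort Country (year): replace gdp_ = gdp_[_n-1] if mi(gdp_)

* Net imports relative to the most recent GDP available
gen net_imports_toGDP = (imports_ - exports_) / gdp_

list Country year gdp_ net_imports_toGDP if year == 2017
```

Rows for 2015 have no import or export data in this example, so the ratio is missing there; only the 2017 rows are listed.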
Arthur Morris