Generate a variable based on the most recent I/observation

Question

My data is currently organized in Stata as follows:

input str2 Country gdp_2015  gdp_2016  gdp_2017  imports_2016  imports_2017   exports_2016
"A"         11        12        13       5             6                 8               5 
"B"         11         .        .        5             6                 10               5 
"C"        12          13       .        5             6                  8               5 
end

gen net_imports = (imports_2017-foodexport_2017)

gen net_imports_toGDP = (net_imports/gdpcurrent_2017)

The code works, but it only produces a value when a country has 2017 data. I would like to create an import-to-GDP ratio based on the most recent GDP observation available for each country.

Arthur Morris · Accepted Answer · 2020-11-18T02:08:29.453

4

You could simply replace the missing data as follows:

replace gdp_2016 = gdp_2015 if mi(gdp_2016)
replace gdp_2017 = gdp_2016 if mi(gdp_2017)
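
As a rough sketch of how the ratio could then be computed in wide form: the toy data below is made up for illustration, and exports_2017 is an assumed variable name (the question's listing does not show one). The egen rowlast() line is only there as an independent cross-check of the two replace statements.

```stata
* Toy data for illustration; exports_2017 is an assumed name
clear
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2017 exports_2017
"A" 11 12 13 6 5
"B" 11  .  . 6 5
"C" 12 13  . 6 5
end

* Independent check: last nonmissing GDP across the three year columns
egen gdp_latest = rowlast(gdp_2015 gdp_2016 gdp_2017)

* Order matters: fill 2016 first so a 2015 value can carry through to 2017
replace gdp_2016 = gdp_2015 if mi(gdp_2016)
replace gdp_2017 = gdp_2016 if mi(gdp_2017)

* gdp_2017 now holds the most recent GDP available for each country
gen net_imports = imports_2017 - exports_2017
gen net_imports_toGDP = net_imports / gdp_2017

* Sanity checks on the toy data
assert gdp_2017 == gdp_latest
assert gdp_2017 == 11 if Country == "B"
assert !mi(net_imports_toGDP)
```

Note that the replace statements overwrite the original gdp_2016 and gdp_2017 values. If you want to keep those intact, dividing by a separate variable such as gdp_latest may be the safer choice.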

However, a more general approach would begin by reshaping your data from wide to long:

reshape long gdp_ imports_ exports_, i(Country)

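See help reshape for more detail on the command. gdp_, imports_, and exports_ are the stubs that become the new variable names, and i(Country) sets the identifier. Because no j() option is given, the year suffix is stored in a new variable named _j.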
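
To make the long layout concrete, here is a minimal sketch on toy data. It assumes the eighth column in the question's listing is exports_2017, which is my guess, and it names the time variable year through j(year) rather than relying on the default _j.

```stata
* Toy data; exports_2017 as the last column is an assumption
clear
input str2 Country gdp_2015 gdp_2016 gdp_2017 imports_2016 imports_2017 exports_2016 exports_2017
"A" 11 12 13 5 6  8 5
"B" 11  .  . 5 6 10 5
"C" 12 13  . 5 6  8 5
end

* j(year) names the new time variable; without j(), Stata calls it _j
reshape long gdp_ imports_ exports_, i(Country) j(year)

* One row per Country-year: 3 countries x 3 years = 9 rows.
* imports_ and exports_ are missing for 2015 because no such
* wide variables existed.
assert _N == 9
assert mi(imports_) if year == 2015
list, sepby(Country)
```

If you name the time variable this way, use year wherever _j appears in later commands.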

Then you can fill forward within each country using time-series operators:

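* xtset needs a numeric panel identifier, so encode the string variable
encode Country, generate(Country_num)
xtset Country_num _j
* L.gdp_ is the previous year's value within the same country
replace gdp_ = L.gdp_ if mi(gdp_) & !mi(L.gdp_)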
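
For completeness, here is a sketch of the remaining steps on toy long-format data: fill forward, compute the ratio, and keep the latest year per country. It swaps the L. operator for explicit subscripting under by:, so no encode or xtset is needed. That is a different technique from the one above, and the variable name year and the data values are assumptions for illustration.

```stata
* Toy long-format data; names and values are illustrative
clear
input str2 Country year gdp_ imports_ exports_
"A" 2015 11 .  .
"A" 2016 12 5  8
"A" 2017 13 6  5
"B" 2015 11 .  .
"B" 2016  . 5 10
"B" 2017  . 6  5
"C" 2015 12 .  .
"C" 2016 13 5  8
"C" 2017  . 6  5
end

* Carry the last nonmissing GDP forward within each country.
* replace works through observations in order, so a value filled in
* one row is available to the next row and runs of missings are covered.
bysort Country (year): replace gdp_ = gdp_[_n-1] if mi(gdp_)

gen net_imports = imports_ - exports_
gen net_imports_toGDP = net_imports / gdp_

* Keep only the latest year for each country
bysort Country (year): keep if _n == _N

* Sanity checks on the toy data
assert gdp_ == 11 if Country == "B"
assert gdp_ == 13 if Country == "C"
list Country year gdp_ net_imports_toGDP
```

One difference to keep in mind: gdp_[_n-1] refers to the previous row, whereas L.gdp_ refers to the previous time period. The two agree only when every country has a row for every year.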

edited Nov 18 '20 at 02:08

answered Nov 18 '20 at 01:55

Arthur Morris



Note that I'm making some guesses about your actual dataset, because the description of the data is not complete. – Arthur Morris Nov 18 '20 at 01:57

I've added information about the reshape command for reference. – Arthur Morris Nov 18 '20 at 02:06

A `reshape long` is indeed strongly recommended here. – Nick Cox Nov 18 '20 at 09:31
