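Simply put,
"Append" combines data vertically: it stacks the observations (rows) of one data set under another.
"Merge" combines data horizontally: it adds variables (columns) from another data set by matching observations on a key.
For merge, always keep the bigger data set in memory and merge the smaller data set into it, which can save some time.
PS: also choose the many:many option.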
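To make the vertical/horizontal distinction concrete, here is a minimal sketch in plain Python. It is not the statistical package's own implementation; the function names `append_rows` and `merge_on_key` and the toy data are made up for illustration, and the merge shown is a simple key lookup in which each key is assumed to appear at most once in the smaller table.

```python
def append_rows(top, bottom):
    """Vertical combination: stack the rows of `bottom` under `top`."""
    return top + bottom


def merge_on_key(master, using, key):
    """Horizontal combination: add the columns of `using` to every row of
    `master` with a matching key. Unmatched master rows are kept as they are."""
    # Index the smaller table once, then make a single pass over the bigger one.
    lookup = {row[key]: row for row in using}
    merged = []
    for row in master:
        extra = lookup.get(row[key], {})
        merged.append({**row, **extra})
    return merged


# Hypothetical toy data: two yearly wage files and one file of person attributes.
wages_2011 = [{"id": 1, "wage": 10}, {"id": 2, "wage": 12}]
wages_2012 = [{"id": 3, "wage": 11}]
people = [{"id": 1, "state": "OH"}, {"id": 3, "state": "TX"}]

stacked = append_rows(wages_2011, wages_2012)   # more rows, same columns
merged = merge_on_key(stacked, people, "id")    # same rows, more columns

for row in merged:
    print(row)
```

Appending changes the number of rows, while merging changes the number of columns. Looping over the bigger table and looking up the smaller one is one plausible reason why keeping the bigger data set in memory saves time.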
Sunday, September 8, 2013
Saturday, June 9, 2012
Some thoughts on robust regression (1)
A parameter can have several estimators. Among these, the "best" estimators always rely on some special features of the model that they can exploit. For instance, the classical linear regression (CLR) model makes several special assumptions, and if the data are allowed to deviate from these assumptions, new estimation techniques must be applied, such as introducing dummy variables or 2SLS.
Since the "best" estimators are designed to exploit these assumptions, they "get hurt" more seriously than other estimators when some of the assumptions are violated. Thus two kinds of special estimators should be considered, and we call them "robust" estimators. The first kind is an estimator that is not sensitive to violations of the assumptions. Think about a very common situation in which we cannot tell whether any of the assumptions are violated, because of limited information or a small sample. In that case we may prefer a less-than-"best" estimator: one that is not as good as the "best" ones when the assumptions hold, but is also not sensitive to violations, i.e., the violations it may have to suffer do not ruin the whole model. It can be thought of as a "least worst" estimator. This first kind of robust (insensitive) estimator is very common; for example, the OLS estimator itself is considered a robust estimator in this sense.
The other kind of robust estimator is designed to resist the violations. If you have taken intermediate-level econometrics, you must know such estimators, though you may not know that they are called "robust estimators". The most commonly seen ones are those used to correct for heteroscedasticity. One example involves structural change: we cannot estimate the var-cov matrix directly. Instead, we have to estimate two var-cov matrices and use some kind of weighting technique to jointly determine the estimate of the var-cov matrix. This example appeared in an exam in my master's econometrics class.
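As a rough illustration of the heteroscedasticity case, the sketch below fits a one-regressor OLS line and compares the classical standard error of the slope with a heteroscedasticity-robust one. The post does not say which correction it has in mind; the White (HC0) formula used here is my assumption, and the function name `ols_with_robust_se` and the data are made up.

```python
import math


def ols_with_robust_se(x, y):
    """Simple regression y = a + b*x.

    Returns (slope, intercept, classical SE of slope, HC0 robust SE of slope).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]                 # demeaned regressor
    sxx = sum(d * d for d in xd)

    slope = sum(d * (yi - y_bar) for d, yi in zip(xd, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical SE: assumes one common error variance s2 for all observations.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)

    # HC0 SE: each observation contributes its own squared residual.
    se_robust = math.sqrt(sum(d * d * e * e for d, e in zip(xd, resid))) / sxx
    return slope, intercept, se_classical, se_robust


# Made-up data whose scatter grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.8, 6.4, 7.5, 10.9, 11.0, 15.2, 14.1]

slope, intercept, se_classical, se_robust = ols_with_robust_se(x, y)
print("slope:", slope)
print("classical SE:", se_classical)
print("robust (HC0) SE:", se_robust)
```

The point estimate is the same in both cases. Only the estimate of its variance changes, which is why this kind of estimator "resists" the violation rather than ignoring it.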
To be continued...
Monday, May 14, 2012
How to summarize the frequency of selected data in Excel
Admitting that the research I am currently doing is totally worthless, I did at least work out how to summarize the frequency of selected data in Excel.
1. Generating criteria
For example, suppose you believe the data set contains only a few unique values, such as "1, 2, 4, 5", even though the data set is very large. Then create a column holding these four unique values.
2. Using functions
Use "=FREQUENCY(data range, criteria range)" and select the corresponding data and criteria ranges.
3. Finishing up
First, starting from the cell that holds the formula, select enough contiguous empty cells below it. "Enough" means the number of empty cells should equal the number of unique values minus 1 (the first cell is the one where the FREQUENCY function was entered in step 2).
Second, press F2.
Third, press "Shift" + "Control" + "Enter".
All is done now.
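For comparison, the same tabulation can be sketched in a few lines of Python. The function name `frequency_of_selected` and the sample data are made up. This version counts exact matches, whereas Excel's FREQUENCY counts values falling into intervals; the two agree when the data contain only the listed values.

```python
from collections import Counter


def frequency_of_selected(data, criteria):
    """Count how often each value in `criteria` occurs in `data`."""
    counts = Counter(data)                  # one pass over the data
    return [counts[value] for value in criteria]


data = [1, 2, 2, 4, 5, 5, 5, 1]             # stands in for the large data column
criteria = [1, 2, 4, 5]                     # the column of unique values from step 1

print(frequency_of_selected(data, criteria))   # prints [2, 2, 1, 3]
```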
Sunday, April 29, 2012
Discussions on Logit, multinomial Logit and conditional Logit models
Simply put, discrete choice models are based on the theory of utility maximization. "Logit" means the random component of utility follows a logistic distribution (more precisely, the alternative-specific errors are i.i.d. type I extreme value, so their differences are logistic). "Multinomial" means the expected utilities η_ij are modeled in terms of characteristics of the individuals. "Conditional" means the expected utilities are modeled in terms of characteristics of the choices.
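In formulas, writing η_ij for the expected utility that individual i gets from choice j, the two models can be sketched as follows. The symbols x_i (characteristics of individual i), z_ij (characteristics of choice j as faced by individual i), β_j, γ and J (the number of choices) are my notation, not part of the original post, and `\text` assumes the amsmath package.

```latex
\[
\text{Multinomial logit:}\quad
\eta_{ij} = x_i'\beta_j, \qquad
\Pr(Y_i = j) = \frac{\exp(x_i'\beta_j)}{\sum_{k=1}^{J} \exp(x_i'\beta_k)}
\]
\[
\text{Conditional logit:}\quad
\eta_{ij} = z_{ij}'\gamma, \qquad
\Pr(Y_i = j) = \frac{\exp(z_{ij}'\gamma)}{\sum_{k=1}^{J} \exp(z_{ik}'\gamma)}
\]
```

In the multinomial case the regressors are constant across choices and the coefficients vary by choice (with one β_j normalized to zero for identification); in the conditional case the regressors vary across choices and the coefficients are shared.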
Tuesday, April 24, 2012
How to wrap long equations automatically in LyX
Here is another issue with equation editing in LyX: long equations do not wrap automatically. There is a 3-step method to solve this problem.
1. First, make sure you have installed the mh bundle, which includes the breqn package.
2. Change the Math tab in the document settings to hm~~. Make sure that only this option is checked.
3. Add the following code to the preamble:
\usepackage{breqn}
% Add support for automatic equation breaking
\gdef\wrap@breqn@environ#1#2{
\expandafter\let\csname breqn@oldbegin@#1\expandafter\endcsname\csname #1\endcsname
\expandafter\let\csname breqn@oldend@#1\expandafter\endcsname\csname end#1\endcsname
\expandafter\gdef\csname breqn@begin@#1\endcsname{%
\expandafter\let\csname #1\expandafter\endcsname\csname breqn@oldbegin@#1\endcsname%
\begin{#2}%
}
\expandafter\gdef\csname breqn@end@#1\endcsname{%
\expandafter\let\csname end#1\expandafter\endcsname\csname breqn@oldend@#1\endcsname%
\end{#2}%
\expandafter\let\csname #1\expandafter\endcsname\csname breqn@begin@#1\endcsname%
\expandafter\let\csname end#1\expandafter\endcsname\csname breqn@end@#1\endcsname%
}
\expandafter\let\csname #1\expandafter\endcsname\csname breqn@begin@#1\endcsname
\expandafter\let\csname end#1\expandafter\endcsname\csname breqn@end@#1\endcsname
}
\wrap@breqn@environ{equation}{dmath}
\wrap@breqn@environ{equation*}{dmath*}
Now choose "displayed formula" to create or edit equations.
That's it!
Monday, April 23, 2012
Working paper series: Carbon dioxide emission and environmental policies
Economists have been seeking an accurate assessment of effective environmental policies in the face of the imminent threat of global warming, which results in part from carbon dioxide emissions. In practice, these policies can be classified into two categories: one encourages production methods with zero or lower emissions by utilizing renewable resources and stimulating the development of environment-friendly technology, and the other discourages or punishes excessive carbon dioxide emissions. (tbc)
Saturday, April 21, 2012