GFM: Generalized factor model for ultra-high dimensional variables with mixed types.
GFM is a package for analyzing (ultra-)high-dimensional data with mixed-type variables, developed by Huazhen Lin’s lab. It is computationally efficient, scales with increasing sample size, and can choose the number of factors. Our JASA paper proposes a two-step method to estimate the factor and loading matrices, in which the first step uses the alternate maximization (AM) algorithm to obtain an initial estimator, and provides an information criterion to determine the number of factors. More recently, we proposed an overdispersed generalized factor model (OverGFM), implemented with a variational EM algorithm, together with a singular value ratio based method to determine the number of factors. In addition, the estimates from OverGFM can also be used as the initial estimates in the first step of the GFM method from our JASA paper.
Check out our JASA paper for the alternate maximization algorithm and the information criterion, and our package vignette for a more complete description of the usage of GFM and OverGFM.
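To give a feel for the singular value ratio idea mentioned above, here is a minimal base-R sketch on simulated Gaussian data. It is a generic illustration and not the code used inside the package: the simulation settings and the names `q_true`, `q_max`, `ratios` and `q_hat` are hypothetical, and the rule is applied to the singular values of the centered data matrix purely for simplicity.

```r
# Illustrative sketch only: a generic singular value ratio rule on simulated
# Gaussian data. This is NOT the implementation used by the GFM package.
set.seed(1)
n <- 200       # sample size
p <- 50        # number of variables
q_true <- 3    # true number of factors (known here because we simulate)
q_max <- 10    # largest number of factors considered

H <- matrix(rnorm(n * q_true), n, q_true)      # latent factors
B <- matrix(rnorm(p * q_true), p, q_true)      # loading matrix
X <- H %*% t(B) + matrix(rnorm(n * p), n, p)   # low-rank signal plus unit-variance noise

# Singular values of the column-centered data matrix, in decreasing order
d <- svd(scale(X, center = TRUE, scale = FALSE))$d

# Ratios of consecutive singular values: d[k] / d[k + 1] for k = 1, ..., q_max
ratios <- d[1:q_max] / d[2:(q_max + 1)]

# The selected number of factors is the k at which the ratio is largest
q_hat <- which.max(ratios)
print(q_hat)
```

With a signal this strong relative to the noise, the largest ratio would be expected at the true number of factors; with weaker signals or non-Gaussian variable types the gap is less clear-cut, which is where the model-based estimates in the package come in.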
GFM can be used to analyze experimental datasets from different areas, for instance:
Please see our new paper for model details:
Wei Liu, Huazhen Lin, Shurong Zheng & Jin Liu (2021). Generalized factor model for ultra-high dimensional mixed data. Journal of the American Statistical Association (Online).
To install the ‘GFM’ package from GitHub, first install the ‘remotes’ package.
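A minimal sketch of the two-step GitHub installation described above. The repository path `feiyoung/GFM` is an assumption, since this text does not state it; substitute the actual repository if it differs.

```r
# Step 1: install the 'remotes' helper package from CRAN
install.packages("remotes")

# Step 2: install GFM from GitHub.
# NOTE: "feiyoung/GFM" is an assumed repository path, not confirmed by this text.
remotes::install_github("feiyoung/GFM")
```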
Alternatively, install the ‘GFM’ package from CRAN.
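The CRAN route needs only the standard installer that ships with R:

```r
# Install the released version of GFM from CRAN
install.packages("GFM")

# Load the package to confirm the installation worked
library(GFM)
```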
For typical GFM usage, please see our package vignette, which gives a demonstration and an overview of the functions included in GFM.
GFM version 1.2.1 (2023-08-10)
overdispersedGFM(), which implements the overdispersed generalized factor model, is added. In addition,
OverGFMchooseFacNumber() is added, which
implements a singular value ratio (SVR) based method to select the number