Notes
- "(br)" stands for "bounded and rounded scores" (equivalent to "round_trim" in the main report)
- "(b)" stands for "bounded scores" (equivalent to "trim" in the main report)
- The models evaluated here are trained on feature values truncated to fall within 4 standard deviations of their training set mean (i.e., not on the raw feature values).
- All correlations (feature-score correlation matrices, marginal correlations with human scores, partial correlations, and partial correlations without length) are computed on truncated and standardized values.
- The item-level results are based on rounded, trimmed scores from ordinary least squares regression models.
- The feature statistics, box plots, etc. are based on the training set. Only the evaluations of the model use the test set.
- In the boxplots for feature values, the red dotted line indicates the threshold for truncation (Mean +/- 4*SD).
- Percentiles are computed using the SAS definition (i.e., "Type 3" as described here). Mild outliers are defined as data points between [1.5, 3) * IQR away from the nearest quartile. Extreme outliers are data points >= 3 * IQR away from the nearest quartile.
- Principal component analyses use singular value decomposition and are computed after standardization and outlier truncation.
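
The truncation and outlier rules described in the notes above can be sketched as follows. This is a minimal illustration, not the code used to produce the report: the function names `truncate_feature` and `classify_outlier` are hypothetical, the quartiles are passed in rather than computed (so the "Type 3" percentile definition is not reproduced here), and the use of the sample standard deviation in the usage example is an assumption.

```python
from statistics import mean, stdev


def truncate_feature(values, train_mean, train_sd, k=4.0):
    """Clamp each value to [train_mean - k*train_sd, train_mean + k*train_sd].

    train_mean and train_sd are assumed to come from the training set only.
    """
    lower = train_mean - k * train_sd
    upper = train_mean + k * train_sd
    return [min(max(v, lower), upper) for v in values]


def classify_outlier(value, q1, q3):
    """Label a point 'none', 'mild', or 'extreme' by its distance from
    the nearest quartile, measured in units of the IQR.

    mild:    distance in [1.5, 3) * IQR
    extreme: distance >= 3 * IQR
    """
    iqr = q3 - q1
    if value < q1:
        distance = q1 - value
    elif value > q3:
        distance = value - q3
    else:
        return "none"  # inside the box, not an outlier
    if distance >= 3 * iqr:
        return "extreme"
    if distance >= 1.5 * iqr:
        return "mild"
    return "none"


# Usage: estimate the statistics on (hypothetical) training values,
# then apply the same thresholds to new values.
train = [1.0, 2.0, 3.0, 4.0, 5.0]
m, sd = mean(train), stdev(train)  # sample SD: an assumption
truncated = truncate_feature([100.0, 3.0, -100.0], m, sd)
```

Taking the mean and standard deviation as arguments, rather than computing them inside the function, keeps the thresholds tied to the training set when the same truncation is applied to test data.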