Notes

  • "(br)" stands for "bounded and rounded scores" (equivalent to "round_trim" in the main report)
  • "(b)" stands for "bounded scores" (equivalent to "trim" in the main report)
  • The models evaluated here are trained on feature values truncated to fall within 4 standard deviations of their training set mean (i.e., not on the raw feature values).
  • All correlations (feature-score correlation matrices, marginal correlations with human scores, partial correlations, and partial correlations without length) are computed on truncated and standardized values.
  • The item-level results are based on rounded, trimmed scores from ordinary least squares regression models.
  • The feature statistics, box plots, etc. are based on the training set. Only the evaluations of the model use the test set.
  • In the box plots for feature values, the red dotted lines indicate the thresholds for truncation (Mean +/- 4*SD).
  • Percentiles are computed using the SAS definition (i.e., "Type 3" as described here). Mild outliers are data points that lie [1.5, 3) * IQR away from the nearest quartile. Extreme outliers are data points that lie >= 3 * IQR away from the nearest quartile.
  • Principal component analyses use singular value decomposition and are computed after standardization and outlier truncation.
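
The truncation and outlier rules in the notes above can be sketched as follows. This is only a minimal illustration, not the code that produced this report: the function names `truncate_features` and `classify_outliers` are hypothetical, the use of the sample standard deviation (ddof=1) is an assumption, and the quartiles are passed in as arguments rather than computed, so the sketch makes no claim about how the "Type 3" percentiles are obtained.

```python
import numpy as np


def truncate_features(train_values, values, n_sd=4.0):
    """Clip values to mean +/- n_sd * SD, with mean and SD taken from the training set.

    Assumption: the sample standard deviation (ddof=1) is used.
    """
    train_values = np.asarray(train_values, dtype=float)
    mean = train_values.mean()
    sd = train_values.std(ddof=1)
    return np.clip(np.asarray(values, dtype=float), mean - n_sd * sd, mean + n_sd * sd)


def classify_outliers(values, q1, q3):
    """Label each value 'none', 'mild' or 'extreme' given precomputed quartiles q1 and q3."""
    values = np.asarray(values, dtype=float)
    iqr = q3 - q1
    # Distance beyond the nearest quartile; points inside the box get 0.
    dist = np.maximum(np.maximum(q1 - values, values - q3), 0.0)
    labels = np.full(values.shape, "none", dtype=object)
    # Mild: at least 1.5 * IQR but less than 3 * IQR beyond the nearest quartile.
    labels[(dist > 0) & (dist >= 1.5 * iqr) & (dist < 3 * iqr)] = "mild"
    # Extreme: at least 3 * IQR beyond the nearest quartile.
    labels[(dist > 0) & (dist >= 3 * iqr)] = "extreme"
    return labels
```

For example, with hypothetical quartiles q1 = 0 and q3 = 10 (IQR = 10), a value of 25 lies 1.5 * IQR above the upper quartile and is labeled mild, while a value of 40 lies 3 * IQR above it and is labeled extreme.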