The phenotypic heterogeneity of Autism Spectrum Disorders (ASD) poses particular challenges for research on the assessment of symptom severity. The standardized Autism Diagnostic Observation Schedule (ADOS) provides a severity metric, the calibrated severity score (CSS), that is relatively unaffected by individual characteristics. To date, no studies have examined the convergent validity of the CSS in Chinese samples. The present study investigated the validity of the ADOS-CSS in a sample of 321 children with ASD aged 2-18 years, and built upon existing literature by examining the influence of non-ASD-specific characteristics on other measures, including the Autism Diagnostic Interview-Revised (ADI-R), the Social Responsiveness Scale (SRS), and the Vineland Adaptive Behavior Scales (VABS). As expected, the CSS were less influenced than ADOS-RAW scores by demographic and developmental-level variables. Moreover, in contrast to the ADOS-CSS, the ADI-R, SRS, and VABS scores remained strongly correlated with confounding factors such as chronological age, intelligence quotient, and language level. These results support the use of the CSS as a more valid indicator of ASD severity than ADOS raw scores and scores from other instruments.