chinese-social-data

Community

Access and preprocess Chinese social science datasets quickly.

Author: Yuuqq
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

It removes friction in finding, loading, cleaning, harmonizing, and preparing Chinese social science data so you can move from raw files to analysis-ready datasets quickly.

Core Features & Use Cases

  • Survey dataset access & loading: Handles common Chinese survey microdata formats (Stata/SPSS/SAS) and typical encoding issues.
  • Cleaning & harmonization workflows: Converts survey-specific missing-value codes to standard missing values and harmonizes variables across waves (e.g., CGSS).
  • China-specific administrative and text workflows: Supports administrative division code parsing (GB/T 2260) and Chinese text preprocessing (jieba segmentation, stopwords, term dictionary).
  • Social media collection & privacy guardrails: Provides a research-oriented approach for collecting and cleaning social media text while emphasizing PIPL compliance and de-identification.
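To illustrate the administrative division code parsing mentioned above, here is a minimal sketch using only the standard library. It relies on the fixed layout of a six-digit GB/T 2260 code (two digits for the province level, two for the prefecture level, two for the county level). The names `DivisionCode` and `parse_division_code` are hypothetical and chosen for illustration; they are not taken from the skill's own scripts.

```python
from typing import NamedTuple, Optional


class DivisionCode(NamedTuple):
    """Hypothetical container for the parts of a six-digit GB/T 2260 code."""
    province: str               # province-level code, e.g. "110000"
    prefecture: Optional[str]   # prefecture-level code, or None for a province
    county: Optional[str]       # county-level code, or None above county level
    level: str                  # "province", "prefecture", or "county"


def parse_division_code(code) -> DivisionCode:
    """Split a six-digit division code into its hierarchical parts.

    Accepts a string or an integer. Values that are not exactly six ASCII
    digits (for example floats such as 110101.0 read from a survey file)
    are rejected rather than silently repaired.
    """
    text = str(code).strip()
    if len(text) != 6 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected a six-digit code, got {code!r}")

    province = text[:2] + "0000"
    if text[2:] == "0000":
        # Only the first two digits carry information.
        return DivisionCode(province, None, None, "province")

    prefecture = text[:4] + "00"
    if text[4:] == "00":
        # The first four digits carry information.
        return DivisionCode(province, prefecture, None, "prefecture")

    return DivisionCode(province, prefecture, text, "county")


# Example: a county-level code and the codes of its parent units.
parsed = parse_division_code("110101")
print(parsed.level, parsed.province, parsed.prefecture)
```

Returning the parent codes alongside the level makes it straightforward to aggregate county-level survey records up to prefecture or province level with an ordinary group-by. This sketch only checks the format of a code; it does not verify that the code is actually assigned in a given edition of the standard.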

Quick Start

Load your CGSS or CFPS dataset file, clean its missing-value codes, harmonize key variables across waves, and output an analysis-ready pandas DataFrame for downstream modeling.
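The workflow above might look roughly like the following pandas sketch. Everything specific in it is an assumption made for illustration: the missing-value codes, the per-wave variable names, and the helper names `load_wave`, `clean_wave`, and `harmonize` are hypothetical and do not come from the skill's scripts or from any official codebook. Check the codebook of your own survey wave before reusing any of these values.

```python
import numpy as np
import pandas as pd

# ASSUMPTION: numeric codes that mean "missing" in the raw files.
# Real surveys document their own codes per variable and per wave.
MISSING_CODES = [-1, -2, -3, -8]

# ASSUMPTION: hypothetical raw variable names for each wave, mapped to
# one harmonized name so the waves can be stacked.
RENAME_BY_WAVE = {
    2015: {"a2": "gender", "a8a": "income"},
    2017: {"sex": "gender", "inc": "income"},
}


def load_wave(path: str) -> pd.DataFrame:
    """Read a Stata file, keeping numeric codes instead of value labels."""
    return pd.read_stata(path, convert_categoricals=False)


def clean_wave(df: pd.DataFrame, wave: int) -> pd.DataFrame:
    """Rename variables to harmonized names and blank out missing codes."""
    out = df.rename(columns=RENAME_BY_WAVE[wave])
    # Applied to every column here for brevity; restrict this to the
    # relevant columns if negative numbers can be legitimate values.
    out = out.replace(MISSING_CODES, np.nan)
    out["wave"] = wave
    return out


def harmonize(waves: dict) -> pd.DataFrame:
    """Stack cleaned waves into one long DataFrame with a wave column."""
    frames = [clean_wave(df, wave) for wave, df in waves.items()]
    return pd.concat(frames, ignore_index=True)


# Small in-memory frames stand in for files loaded with load_wave().
w2015 = pd.DataFrame({"a2": [1, 2, -8], "a8a": [50000, -2, 30000]})
w2017 = pd.DataFrame({"sex": [2, 1], "inc": [-1, 42000]})

panel = harmonize({2015: w2015, 2017: w2017})
print(panel)
```

Keeping the rename map and the missing-code list as plain data makes it easy to review them against the codebook and to extend them when a new wave is added.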

Dependency Matrix

Required Modules

numpy, pandas, jieba, requests, time, re

Components

scripts, references, assets

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: chinese-social-data
Download link: https://github.com/Yuuqq/claude-social-science-skills/archive/main.zip#chinese-social-data

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
