
Laboratory News

Key Laboratory Team Releases the Sign Language Corpus NationalCSL-DP

The NationalCSL-DP dataset

       

        The NationalCSL-DP dataset was developed by the Sichuan Province Key Laboratory of Philosophy and Social Science for Language Intelligence in Special Education and the Key Laboratory of Internet Natural Language Intelligent Processing of Sichuan Provincial Education Department.

        The NationalCSL-DP dataset has the largest vocabulary among existing public ISLR (Isolated Sign Language Recognition) datasets. It contains 6707 glosses from the Chinese National Sign Language (CNSL) vocabulary and provides 134140 sign videos recorded from two perpendicular views of the signer, i.e. the front and the left side. To build the dataset, 10 participants were recruited (2 males and 8 females) with a mean age of 19.82±0.28 years. Of these, 8 were deaf students and 2 were hearing students, all highly proficient in CNSL. The videos were recorded in a supervised environment in two green-screen studios, each equipped with two high-definition RGB cameras. All cameras were configured to record at a resolution of 1920×1080 pixels and a frame rate of 50 frames per second. Each gloss in the vocabulary was signed by all ten signers. To our knowledge, this is the first ISLR dataset that provides dual-view RGB videos and covers every gloss in the CNSL vocabulary.

        Following the practice of popular ISLR datasets (e.g. WLASL, MS-ASL), we created five subsets from the original dataset, each containing a different number of glosses: NationalCSL200, NationalCSL500, NationalCSL1000, NationalCSL2000, and NationalCSL6707. This division supports more targeted experimentation and analysis in sign language research, making it possible to evaluate how vocabulary size affects the performance of sign recognition systems.
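        As a quick consistency check on the figures above, the total number of videos follows from glosses × signers × views. The sketch below reproduces that arithmetic and estimates the number of videos in each subset. It assumes that each subset keeps every recording (all ten signers, both views) of its glosses, which the description does not state explicitly; the function and variable names are illustrative only.

```python
# Figures taken from the dataset description.
NUM_SIGNERS = 10  # each gloss was signed by all ten participants
NUM_VIEWS = 2     # front view and left-side view

# Number of glosses per subset, as implied by the subset names.
SUBSET_GLOSSES = {
    "NationalCSL200": 200,
    "NationalCSL500": 500,
    "NationalCSL1000": 1000,
    "NationalCSL2000": 2000,
    "NationalCSL6707": 6707,
}


def expected_video_count(num_glosses: int,
                         num_signers: int = NUM_SIGNERS,
                         num_views: int = NUM_VIEWS) -> int:
    """Return glosses x signers x views.

    Assumption: every gloss has one recording per signer per view.
    """
    return num_glosses * num_signers * num_views


# Estimated video count per subset under the assumption above.
subset_video_counts = {
    name: expected_video_count(glosses)
    for name, glosses in SUBSET_GLOSSES.items()
}

if __name__ == "__main__":
    for name, count in subset_video_counts.items():
        print(f"{name}: {count} videos")
```

        For the full vocabulary this gives 6707 × 10 × 2 = 134140, which matches the number of videos reported for the dataset.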

        The dataset is released under the Creative Commons Attribution 4.0 International license (CC BY 4.0).


        The dataset is released here: 

        Participant_01

        Participant_02

        Participant_03

        Participant_04

        Participant_05

        Participant_06

        Participant_07

        Participant_08

        Participant_09

        Participant_10


        The partitions of the NationalCSL-DP dataset are:

        NationalCSL200

        NationalCSL500

        NationalCSL1000

        NationalCSL2000

        NationalCSL6707


        If you have any questions, please contact: syjing628@126.com

        

    Ethical approval: All participants signed informed consent forms for the sharing of their identity information, as well as agreements consenting to participate in the construction of the NationalCSL-DP dataset and to allow the dataset to be published in venues including, but not limited to, academic journals and online databases. The Ethical Review Board (ERB) of Leshan Normal University (LSNU) reviewed our ethical review application together with the participants' informed consent forms and agreements. The ERB then granted permission for the open publication of the dataset, including manuscript submission and dataset release (Ethical Review Number: LNU-KYLL2025-02-15).