# dcnn-nlp

**Repository Path**: linius/dcnn-nlp

## Basic Information

- **Project Name**: dcnn-nlp
- **Description**: An implementation of the ACL 2014 paper "A Convolutional Neural Network for Modelling Sentences"
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 10
- **Forks**: 1
- **Created**: 2014-06-17
- **Last Updated**: 2023-10-27

## Categories & Tags

**Categories**: ai

**Tags**: None

## README

dcnn-nlp (In development)
==========================================================

dcnn-nlp is a tool that uses convolutional neural networks for natural language processing and text classification. It implements and extends the ACL 2014 paper "A Convolutional Neural Network for Modelling Sentences".
It has the following features:

Examples
==========================================================

```python
import numpy

# NOTE: LineSentence and DCNNDeep are provided by this project and must be
# imported from its modules before running this example.

# Stanford Sentiment Treebank Experiment
# First run `python prepare.py` in the data/stanford directory.

# All sentences, used to pre-train the word vectors
total_data_file = 'data/stanford/total.data'
total_sentences = LineSentence(total_data_file, repeat=5)

# Training set
train_data_file = 'data/stanford/train2.data'
train_label_file = 'data/stanford/train2.label'
train_sentences = LineSentence(train_data_file)
train_labels = numpy.fromfile(train_label_file, sep='\n', dtype=numpy.int32)

# Development (validation) set
dev_data_file = 'data/stanford/dev2.data'
dev_label_file = 'data/stanford/dev2.label'
dev_sentences = LineSentence(dev_data_file)
dev_labels = numpy.fromfile(dev_label_file, sep='\n', dtype=numpy.int32)

# Test set
test_data_file = 'data/stanford/test2.data'
test_label_file = 'data/stanford/test2.label'
test_sentences = LineSentence(test_data_file)
test_labels = numpy.fromfile(test_label_file, sep='\n', dtype=numpy.int32)

# n_filters=[6,14] in the paper
# n_filters=[4,6] in LeNet
# But you can go deeper
model = DCNNDeep(sentences=train_sentences,
                 output_layer_size=2,
                 wordvec_dim=48,
                 alpha=0.012,
                 entropy_descent_m=0.995,
                 dropout_rate_in_hiddens=0.5,
                 dropout_rate_in_input=0.2,
                 min_count=2,
                 full_con_layer_size=5,
                 filter_width=[7, 5, 3],
                 k_top=4,
                 n_filters=[6, 14, 6],
                 alpha_m=0.999995,
                 min_alpha=0.00001,
                 pre_train_word_vec=True,
                 pre_train_sentences=total_sentences)

model.train(train_sentences=train_sentences,
            train_labels=train_labels,
            patience=5,
            validate_freq=2000,
            max_entropy_allowed=0.38,
            validate_sentences=dev_sentences,
            validate_labels=dev_labels,
            chunksize=5)

print('test accuracy: %f' % model.accuracy(test_sentences, test_labels))
```
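
The `k_top` argument in the example above presumably corresponds to the fixed pooling size of the topmost convolutional layer in the paper's dynamic k-max pooling. The following is a minimal NumPy sketch of that operation as described in the paper, not the code used by this project. The function names `dynamic_k` and `k_max_pooling` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dynamic_k(layer, total_layers, sentence_len, k_top):
    """Pooling size for convolutional layer `layer` (1-based).

    Follows the formula from the paper:
        k_l = max(k_top, ceil((L - l) / L * s))
    Integer arithmetic is used for the ceiling to avoid float rounding.
    """
    scaled = ((total_layers - layer) * sentence_len + total_layers - 1) // total_layers
    return max(k_top, scaled)


def k_max_pooling(feature_map, k):
    """Keep the k largest values in each row, preserving their original order.

    `feature_map` has shape (n_rows, width); the result has shape (n_rows, k).
    This sketch assumes k <= width (the paper pads shorter inputs instead).
    """
    # Positions of the k largest entries per row, in arbitrary order
    idx = np.argpartition(feature_map, -k, axis=1)[:, -k:]
    # Sort the positions so the selected values keep their sequence order
    idx.sort(axis=1)
    return np.take_along_axis(feature_map, idx, axis=1)


# Three convolutional layers, k_top=4, sentence of 18 words
print([dynamic_k(l, 3, 18, 4) for l in (1, 2, 3)])  # [12, 6, 4]

fm = np.array([[1, 5, 2, 8, 3, 7],
               [4, 0, 9, 6, 2, 1]])
print(k_max_pooling(fm, 3).tolist())  # [[5, 8, 7], [4, 9, 6]]
```

Unlike ordinary max pooling, the selected values stay in sentence order, so later layers can still see the relative positions of the strongest features.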