How to obtain the STANCE, FNC, MultiNLI, TOPIC, Laptops, Restaurants, and target-dependent datasets (from the NAACL-HLT 2018 paper Multi-task Lea
The paper's GitHub repository: https://github.com/coastalcph/mtl-disparate
The repository contains links such as the SemEval site: http://alt.qcri.org/semeval2016/task4/
The FNC site: http://www.fakenewschallenge.org/
#!/usr/bin/env bash

# Download the SemEval 2016 Task 6 Stance detection dataset
mkdir semeval2016-task6-stance ; cd semeval2016-task6-stance
wget http://alt.qcri.org/semeval2016/task6/data/uploads/stancedataset.zip
wget http://alt.qcri.org/semeval2016/task6/data/uploads/semeval2016-task6-trialdata.txt
curl -L "https://drive.google.com/uc?export=download&id=0B2Z1kbILu3YtenFDUzM5dGZEX2s" > downloaded_Donald_Trump.txt  # not found!!!
unzip stancedataset.zip -d . ; mv StanceDataset/* .
rm stancedataset.zip ; rm -r StanceDataset __MACOSX
cd ..

# Download the Fake News Challenge dataset
mkdir fakenewschallenge ; cd fakenewschallenge
wget https://raw.githubusercontent.com/FakeNewsChallenge/fnc-1/master/competition_test_stances.csv
wget https://raw.githubusercontent.com/FakeNewsChallenge/fnc-1/master/competition_test_bodies.csv
wget https://github.com/FakeNewsChallenge/fnc-1/archive/master.zip
unzip master.zip -d . ; mv fnc-1-master/* .
rm -r fnc-1-master ; rm master.zip
cd ..

# Download the Multi-NLI dataset
mkdir multinli ; cd multinli
wget http://www.nyu.edu/projects/bowman/multinli/multinli_0.9.zip
unzip multinli_0.9.zip -d . ; mv multinli_0.9/* .
rm multinli_0.9.zip ; rm -r multinli_0.9
cd ..

# Download the SemEval 2016 Task 4 Subtask B Topic-based Twitter sentiment analysis dataset
mkdir semeval2016-task4b-topic-based-sentiment ; cd semeval2016-task4b-topic-based-sentiment
curl -L "https://drive.google.com/uc?export=download&id=0B3emjZ5O5vDtSGpKcjQ3cnhldmc" > semeval2016_task4b_topic-based_sentiment.zip  # not found!!!
unzip semeval2016_task4b_topic-based_sentiment.zip -d .
rm semeval2016_task4b_topic-based_sentiment.zip
cd ..

# Download the SemEval 2016 Task 4 Subtask C Topic-based 5-way Twitter sentiment analysis dataset
mkdir semeval2016-task4c-topic-based-sentiment ; cd semeval2016-task4c-topic-based-sentiment
curl -L "https://drive.google.com/uc?export=download&id=1eS67x5vedrzVVk-tcyKSrumigbJKuqH-" > semeval2016_task4c_topic-based_sentiment.zip
unzip semeval2016_task4c_topic-based_sentiment.zip -d .
rm semeval2016_task4c_topic-based_sentiment.zip
cd ..

# Download the SemEval 2016 Task 5 Aspect-based sentiment analysis dataset
mkdir semeval2016-task5-absa-english ; cd semeval2016-task5-absa-english
curl -L "https://drive.google.com/uc?export=download&id=0B3emjZ5O5vDtbTJnUHRIdFBULTg" > semeval2016_task5_absa_english.zip  # not found!!!
unzip semeval2016_task5_absa_english.zip -d .
rm semeval2016_task5_absa_english.zip
cd ..

# Download the target-dependent sentiment analysis dataset of Dong et al. (2014):
# Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification
mkdir target-dependent ; cd target-dependent
curl -L "https://drive.google.com/uc?export=download&id=0B3emjZ5O5vDtTW1SZjItWFlxUUU" > target_dependent.zip  # not found!!!
unzip target_dependent.zip -d .
rm target_dependent.zip
cd ..
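Every download marked "not found!!!" above, I eventually tracked down.
PS: How do you know whether you have found everything, and whether you have found the right files? Look at how the authors process the data and which file names their code expects (fortunately, the authors did not rename files arbitrarily).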
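The check described in the PS can be partly automated. Below is a minimal sketch, assuming you have already read the expected file names out of the authors' preprocessing code. The helper name `check_files` is made up for illustration, and the file names in the usage comment are simply the ones that appear in the download script above.

```bash
#!/usr/bin/env bash
# check_files DIR FILE...
# Hypothetical helper: prints one line per expected file that is missing
# from DIR and returns non-zero if anything is missing.
check_files() {
  local dir=$1; shift
  local missing=0
  local name
  for name in "$@"; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $dir/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (file names taken from the download script above):
# check_files semeval2016-task6-stance \
#   downloaded_Donald_Trump.txt semeval2016-task6-trialdata.txt
```

This only confirms that the files exist under the expected names; whether their contents are the right version still has to be checked against the authors' code.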
PPS: You may need a proxy or VPN to reach some of these links!
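If you do go through a proxy, both `curl` and `wget` pick it up from the `http_proxy` and `https_proxy` environment variables. A minimal sketch follows; the helper name `use_proxy` and the address `http://127.0.0.1:7890` are placeholders of mine, not something from the paper's repository, so substitute whatever your own proxy listens on.

```bash
#!/usr/bin/env bash
# use_proxy [URL]
# Hypothetical helper: exports the proxy variables that curl and wget read.
# The default address is only a placeholder for a locally running proxy.
use_proxy() {
  local proxy=${1:-http://127.0.0.1:7890}
  export http_proxy=$proxy
  export https_proxy=$proxy
}

# Usage: call once in the shell session, then run the download script.
# use_proxy "http://127.0.0.1:7890"
```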
semeval2016-task6-stance
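To get downloaded_Donald_Trump.txt: I found it tucked away in a GitHub issue, which pointed to https://github.com/sheffieldnlp/stance-conditional
It turns out that one of that repository's authors is also an author of this paper!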
https://www.dropbox.com/sh/o8789zsmpvy7bu3/AABRja7NDVPtbjSa-y3GH0jAa?dl=0
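To fetch a Dropbox share link like the one above from the command line rather than the browser, a common trick is to change `dl=0` to `dl=1`, which as far as I know makes Dropbox serve the content directly (a shared folder comes down as a zip). The sketch below only rewrites the URL; the function name `dropbox_direct_url` and the output file name in the usage comment are my own, and the `dl=1` behaviour is an assumption about Dropbox rather than something the authors document.

```bash
#!/usr/bin/env bash
# dropbox_direct_url URL
# Hypothetical helper: rewrites a Dropbox share link so that it requests
# a direct download (dl=1) instead of the preview page (dl=0).
dropbox_direct_url() {
  local url=$1
  case $url in
    *dl=1*) echo "$url" ;;                 # already a direct link
    *dl=0*) echo "${url/dl=0/dl=1}" ;;     # flip the flag
    *\?*)   echo "$url&dl=1" ;;            # other query parameters present
    *)      echo "$url?dl=1" ;;            # no query string yet
  esac
}

# Usage (the output file name is a placeholder):
# curl -L "$(dropbox_direct_url 'https://www.dropbox.com/sh/o8789zsmpvy7bu3/AABRja7NDVPtbjSa-y3GH0jAa?dl=0')" -o stance_extra.zip
```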
semeval2016-task5-absa-english
Download the laptops and restaurants data from:
https://alt.qcri.org/semeval2016/task5/
https://alt.qcri.org/semeval2016/task5/index.php?id=data-and-tools
SemEval 2016 Task 4 Subtask B Topic-based Twitter sentiment analysis dataset
Look for these five files. The authors renamed them by adding "downloaded" to the name, so rename yours to match.
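The renaming can be scripted. The sketch below assumes the convention is a `downloaded_` prefix, as in `downloaded_Donald_Trump.txt` from the download script; that is a guess on my part, so check the exact names the authors' code reads before relying on it. The function name `add_downloaded_prefix` is hypothetical.

```bash
#!/usr/bin/env bash
# add_downloaded_prefix FILE...
# Hypothetical helper: renames each FILE to downloaded_<name> in place,
# skipping files that already carry the prefix.
add_downloaded_prefix() {
  local path dir base
  for path in "$@"; do
    dir=$(dirname -- "$path")
    base=$(basename -- "$path")
    case $base in
      downloaded_*) continue ;;   # already renamed
    esac
    mv -- "$path" "$dir/downloaded_$base"
  done
}

# Usage (run inside the dataset directory, on the files you fetched):
# add_downloaded_prefix semeval2016-task4b-topic-based-sentiment/*.txt
```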
https://alt.qcri.org/semeval2016/task4/index.php?id=data-and-tools
The test set has to be requested; the training set does not.
The gold-label data is available from https://alt.qcri.org/semeval2016/task4/index.php?id=results
under the _script folder.
acl14-target-dataset
A paper that introduces the dataset: "Effective LSTMs for Target-Dependent Sentiment Classification"
The link given in https://aclanthology.org/P14-2009.pdf is
http://goo.gl/5Enpu7, which is where you submit the request.
download link:
(1) Google Drive: https://drive.google.com/file/d/0B8yp1gOBCztyVVVoLTdNZ1JHYVU/edit?usp=sharing
(2) Baidu Drive: http://pan.baidu.com/s/1qWAYsWG
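The Google Drive link above is a browser "view" URL, while the download script earlier uses the `uc?export=download&id=...` form with `curl -L`. A minimal sketch of converting one into the other is shown below; the function name `gdrive_direct_url` and the output file name are my own, and for larger files Google Drive may respond with a confirmation page instead of the file, in which case a browser download is the safer route.

```bash
#!/usr/bin/env bash
# gdrive_direct_url URL
# Hypothetical helper: extracts the file id from a
# https://drive.google.com/file/d/<id>/... link and prints the
# uc?export=download form used by the download script above.
gdrive_direct_url() {
  local url=$1 id
  case $url in
    */file/d/*) ;;
    *) echo "not a /file/d/ link: $url" >&2; return 1 ;;
  esac
  id=${url#*/file/d/}   # drop everything up to and including /file/d/
  id=${id%%/*}          # drop everything after the id
  echo "https://drive.google.com/uc?export=download&id=$id"
}

# Usage (the output file name is a placeholder):
# curl -L "$(gdrive_direct_url 'https://drive.google.com/file/d/0B8yp1gOBCztyVVVoLTdNZ1JHYVU/edit?usp=sharing')" > acl14_target_dataset.zip
```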
Tracking everything down didn't take that long, really. Just a week.