How to obtain the STANCE, FNC, MultiNLI, TOPIC, Laptops, Restaurants, and target-dependent datasets (from the NAACL-HLT 2018 paper Multi-task Lea


GitHub repository for the paper: https://github.com/coastalcph/mtl-disparate

The repository references several sites, such as SemEval: http://alt.qcri.org/semeval2016/task4/

FNC website: http://www.fakenewschallenge.org/

#!/usr/bin/env bash

# Download the SemEval 2016 Task 6 Stance detection dataset
mkdir semeval2016-task6-stance ; cd semeval2016-task6-stance
wget http://alt.qcri.org/semeval2016/task6/data/uploads/stancedataset.zip
wget http://alt.qcri.org/semeval2016/task6/data/uploads/semeval2016-task6-trialdata.txt
curl -L "https://drive.google.com/uc?export=download&id=0B2Z1kbILu3YtenFDUzM5dGZEX2s" > downloaded_Donald_Trump.txt # link not found!
unzip stancedataset.zip -d . ; mv StanceDataset/* .
rm stancedataset.zip ; rm -r StanceDataset __MACOSX
cd ..

# Download the Fake News Challenge dataset
mkdir fakenewschallenge ; cd fakenewschallenge
wget https://raw.githubusercontent.com/FakeNewsChallenge/fnc-1/master/competition_test_stances.csv
wget https://raw.githubusercontent.com/FakeNewsChallenge/fnc-1/master/competition_test_bodies.csv
wget https://github.com/FakeNewsChallenge/fnc-1/archive/master.zip
unzip master.zip -d . ; mv fnc-1-master/* .
rm -r fnc-1-master ; rm master.zip
cd ..

# Download the Multi-NLI dataset
mkdir multinli ; cd multinli
wget http://www.nyu.edu/projects/bowman/multinli/multinli_0.9.zip
unzip multinli_0.9.zip -d . ; mv multinli_0.9/* .
rm multinli_0.9.zip ; rm -r multinli_0.9
cd ..

# Download the SemEval 2016 Task 4 Subtask B Topic-based Twitter sentiment analysis dataset
mkdir semeval2016-task4b-topic-based-sentiment ; cd semeval2016-task4b-topic-based-sentiment
curl -L "https://drive.google.com/uc?export=download&id=0B3emjZ5O5vDtSGpKcjQ3cnhldmc" > semeval2016_task4b_topic-based_sentiment.zip # link not found!

unzip semeval2016_task4b_topic-based_sentiment.zip -d .
rm semeval2016_task4b_topic-based_sentiment.zip
cd ..

# Download the SemEval 2016 Task 4 Subtask C Topic-based 5-way Twitter sentiment analysis dataset
mkdir semeval2016-task4c-topic-based-sentiment ; cd semeval2016-task4c-topic-based-sentiment
curl -L "https://drive.google.com/uc?export=download&id=1eS67x5vedrzVVk-tcyKSrumigbJKuqH-" > semeval2016_task4c_topic-based_sentiment.zip
unzip semeval2016_task4c_topic-based_sentiment.zip -d .
rm semeval2016_task4c_topic-based_sentiment.zip
cd ..

# Download the SemEval 2016 Task 5 Aspect-based sentiment analysis dataset
mkdir semeval2016-task5-absa-english ; cd semeval2016-task5-absa-english
curl -L "https://drive.google.com/uc?export=download&id=0B3emjZ5O5vDtbTJnUHRIdFBULTg" > semeval2016_task5_absa_english.zip # link not found!

unzip semeval2016_task5_absa_english.zip -d .
rm semeval2016_task5_absa_english.zip
cd ..

# Download the target-dependent sentiment analysis dataset of Dong et al. (2014):
# Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification
mkdir target-dependent ; cd target-dependent
curl -L "https://drive.google.com/uc?export=download&id=0B3emjZ5O5vDtTW1SZjItWFlxUUU" > target_dependent.zip # link not found!
unzip target_dependent.zip -d .
rm target_dependent.zip
cd ..

I did manage to find every file marked "link not found" above.
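
When a link in the script is dead, curl may still write something to disk (for example an error page), and the following unzip step then fails with a confusing message. A minimal sketch of a guard, assuming you want to stop as soon as a download is not a ZIP archive. The helper names is_zip and fetch_zip are hypothetical and not part of the authors' script:

```shell
# Hypothetical helper (not in the authors' script): succeed only if the
# file exists and starts with the two-byte ZIP signature "PK".
is_zip() {
    [ -f "$1" ] && [ "$(head -c 2 "$1")" = "PK" ]
}

# Hypothetical wrapper: download URL $1 into file $2 and fail early, with a
# message on stderr, when the download fails or the result is not a ZIP.
fetch_zip() {
    curl -fL "$1" -o "$2" || { echo "download failed: $1" >&2; return 1; }
    is_zip "$2" || { echo "not a zip archive: $2" >&2; return 1; }
}
```

With this in place, a line such as fetch_zip "<url>" target_dependent.zip && unzip target_dependent.zip -d . only unpacks when the archive really arrived.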

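P.S. How do you know whether you have found everything, and found the right files? Look at how the authors' code processes the data and which file names it expects (fortunately, the authors did not rename files arbitrarily).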
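
Once you have read the expected file names out of the authors' code, the check itself can be automated. A rough sketch, assuming you pass the directory and the expected names yourself. The function check_files is hypothetical, and the list of names must come from the authors' data-loading code, not from this example:

```shell
# Hypothetical helper: check_files DIR NAME...
# Prints one "missing: DIR/NAME" line for every NAME that does not exist in
# DIR and returns non-zero if at least one is missing.
check_files() {
    dir=$1; shift
    missing=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_files semeval2016-task6-stance downloaded_Donald_Trump.txt reports whether the file the script tried to download is in place.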

P.P.S. You may need a proxy or VPN to reach some of these links.

semeval2016-task6-stance

To download downloaded_Donald_Trump.txt: the source was buried in a GitHub issue, which pointed to https://github.com/sheffieldnlp/stance-conditional

One of the authors of that repository turns out to be an author of this paper as well!
https://www.dropbox.com/sh/o8789zsmpvy7bu3/AABRja7NDVPtbjSa-y3GH0jAa?dl=0


semeval2016-task5-absa-english

Download the Laptops and Restaurants data from:
https://alt.qcri.org/semeval2016/task5/
https://alt.qcri.org/semeval2016/task5/index.php?id=data-and-tools

SemEval 2016 Task 4 Subtask B Topic-based Twitter sentiment analysis dataset

Look for these five files. The authors renamed them by adding "downloaded" to the names, so rename your copies to match.

https://alt.qcri.org/semeval2016/task4/index.php?id=data-and-tools

The test set has to be requested; the training set does not.

The gold-label data is on the results page, under the _script folder: https://alt.qcri.org/semeval2016/task4/index.php?id=results
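
Renaming the downloaded files by hand is error-prone, so a small helper can do it. This is only a sketch: it assumes "downloaded" is added as the prefix downloaded_, as in downloaded_Donald_Trump.txt above. The exact names the authors' code expects for the Task 4 files should be checked in their repository. The function add_prefix is hypothetical:

```shell
# Hypothetical helper: add_prefix PREFIX FILE...
# Renames each FILE to PREFIX followed by its base name, in the same
# directory. Files whose name already starts with PREFIX are left alone.
add_prefix() {
    prefix=$1; shift
    for path in "$@"; do
        dir=$(dirname "$path")
        base=$(basename "$path")
        case "$base" in
            "$prefix"*) continue ;;
        esac
        mv -- "$path" "$dir/$prefix$base"
    done
}
```

It would be called as add_prefix downloaded_ followed by the paths of the files to rename. Running it twice is harmless because files that already carry the prefix are skipped.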


acl14-target-dataset

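A paper that describes the dataset: "Effective LSTMs for Target-Dependent Sentiment Classification"

The dataset itself comes from Dong et al. (2014), the paper named in the script above: https://aclanthology.org/P14-2009.pdf

Request access through the link given in that paper: http://goo.gl/5Enpu7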

Download links:
(1) Google Drive: https://drive.google.com/file/d/0B8yp1gOBCztyVVVoLTdNZ1JHYVU/edit?usp=sharing
(2) Baidu Drive: http://pan.baidu.com/s/1qWAYsWG

Tracking everything down did not take all that long: about a week.