The JDPA Sentiment Corpus for the Automotive Domain


Transcripts
Slide 1

The JDPA Sentiment Corpus for the Automotive Domain Jason S. Kessler Miriam Eckert, Lyndsie Clark, Nicolas Nicolov J.D. Power and Associates Indiana University

Slide 2

Overview: 335 blog posts containing opinions about cars; 223K tokens of blog data. Goal of the annotation task: capture examples of how words interact to evaluate entities; the annotations encode these interactions. Entities are evoked physical objects and their properties, not just cars and car parts: people, locations, organizations, times.

Slide 3

Excerpt from the corpus: "last night was nice. sean bought me caribou and we went to my house to watch the ball game …" "… yesturday i helped me mother with brians house and then we went and looked at a kia spectra. it looked nice, but when we got up to it, i wasn't impressed ..."

Slide 4

Outline: Motivating example; Overview of annotation types; Some statistics; Potential uses of the corpus; Comparison to other resources

Slide 5

[Figure: the running example ("John recently purchased a Honda Civic. It had a great stereo, a disappointing engine …; he also considered a BMW which, while priced highly, had a better stereo and was very grippy.") with mention labels: John is a PERSON; Honda Civic, It, and BMW are CARs; stereo and engine are CAR-PARTs; the pricing is a CAR-FEATURE. REFERS-TO links resolve "It" to the Honda Civic.]

Slide 6

[Figure: the same example with TARGET links from each sentiment expression (great, disappointing, better, grippy, priced highly) to the mention it evaluates.]

Slide 7

[Figure: the same example with PART-OF links (the stereo and engine belong to a car) and FEATURE-OF links (the pricing is a feature of a car), alongside the REFERS-TO links.]

Slide 8

[Figure: the same example with a comparison DIMENSION annotation: "better stereo" places the two cars' stereos on a LESS/MORE scale.]

Slide 9

[Figure: all annotation layers combined (TARGET, REFERS-TO, PART-OF, FEATURE-OF, DIMENSION), from which entity-level sentiment is derived: positive for one entity, mixed for another.]

Slide 10

Outline: Motivating example; Overview of annotation types; Some statistics; Potential uses of the corpus; Comparison to other resources

Slide 11

Entity annotations: "John recently purchased a Civic. It had a great engine and was priced well." Mentions are typed (John: PERSON; Civic, It: CAR; engine: CAR-PART; priced: CAR-FEATURE) and linked by REFERS-TO coreference (It refers to the Civic). More than 20 semantic types, drawn from the ACE Entity Mention Detection task plus generic car types.
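
The mention-and-coreference scheme on this slide can be represented in code. The following Python sketch is a hypothetical encoding, not the corpus's actual file format; all names are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    """A mention span with a semantic type, e.g. PERSON, CAR, CAR-PART."""
    start: int   # token offset where the span begins
    end: int     # token offset just past the span end
    text: str
    semantic_type: str

# "John recently purchased a Civic. It had a great engine and was priced well."
mentions = [
    Mention(0, 1, "John", "PERSON"),
    Mention(4, 5, "Civic", "CAR"),
    Mention(6, 7, "It", "CAR"),
    Mention(10, 11, "engine", "CAR-PART"),
    Mention(13, 14, "priced", "CAR-FEATURE"),
]

# REFERS-TO: coreferent mentions grouped into one entity.
refers_to = [
    {mentions[1], mentions[2]},   # "Civic" and "It" denote the same car
]

def entities(mentions, refers_to):
    """Collapse mentions into entities: each coreference chain is one
    entity; every unchained mention is its own singleton entity."""
    chained = set().union(*refers_to) if refers_to else set()
    return refers_to + [{m} for m in mentions if m not in chained]

print(len(mentions), len(entities(mentions, refers_to)))  # 5 4
```

Five mentions collapse into four entities because "Civic" and "It" corefer; this is the distinction behind the mention/entity counts reported on slide 13.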

Slide 12

Entity-relation annotations: relations between entities, plus entity-level sentiment annotations; sentiment flows between entities through the relations. "My car has a great engine." (the engine is PART-OF the car; entity-level sentiment for the car: positive) "Honda, known for its high standards, made my car." (the standards are a FEATURE-OF Honda)
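
One way to picture sentiment flowing through relations is a toy propagation function. This is an illustrative sketch under assumed numeric scores and a made-up damping factor, not the paper's actual aggregation rules:

```python
# Sentiment expressed on a part or feature contributes to the sentiment
# of the whole it is PART-OF / FEATURE-OF.
sentiment = {"engine": +1.0}          # "a great engine" -> positive
relations = [("engine", "PART-OF", "car")]

def entity_level_sentiment(entity, sentiment, relations, damping=0.5):
    """Score = direct sentiment on the entity plus damped sentiment
    flowing in from its parts and features."""
    score = sentiment.get(entity, 0.0)
    for child, rel, parent in relations:
        if parent == entity and rel in ("PART-OF", "FEATURE-OF"):
            score += damping * entity_level_sentiment(child, sentiment,
                                                      relations, damping)
    return score

print(entity_level_sentiment("car", sentiment, relations))  # 0.5
```

The car receives no direct sentiment, yet ends up positive because praise for its engine flows upward through the PART-OF link.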

Slide 13

Entity annotation statistics: inter-annotator agreement is 83% on mentions and 68% on REFERS-TO. 61K mentions in the corpus and 43K entities; 103 documents annotated by about 3 annotators each. MATCH: both annotators mark the same span (… Kia Rio …). NOT A MATCH: the annotators' span boundaries differ.
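
The exact-span matching criterion can be made concrete with a small function. This is a sketch of one plausible directed measure; the paper's actual agreement metric may differ in details such as symmetrization or type matching:

```python
def span_agreement(spans_a, spans_b):
    """Fraction of annotator A's spans that annotator B marked with
    exactly the same (start, end) boundaries."""
    if not spans_a:
        return 1.0
    b = set(spans_b)
    matches = sum(1 for s in spans_a if s in b)
    return matches / len(spans_a)

# Same boundaries -> a match; different boundaries -> not a match.
a1 = [(3, 5)]   # "... [Kia Rio] ..."
a2 = [(3, 5)]   # "... [Kia Rio] ..."  (identical span)
a3 = [(3, 4)]   # "... [Kia] Rio ..."  (boundary differs)
print(span_agreement(a1, a2), span_agreement(a1, a3))  # 1.0 0.0
```

Even a one-token boundary difference counts as a miss under exact matching, which is part of why span agreement (83% on mentions) is well below 100%.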

Slide 14

Sentiment expressions: "… a great engine". Prior polarity: positive. Evaluations have a target mention. Prior polarity is the semantic orientation given the target: positive, negative, neutral, or mixed. "highly priced": prior polarity negative. "highly spec'ed": prior polarity positive.
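
Target-dependent prior polarity could be encoded as a lexicon keyed on (expression, target), as in this hypothetical sketch (the keys and entries are illustrative, not the corpus's actual representation):

```python
# Prior polarity: the semantic orientation of a sentiment expression
# given its target (positive / negative / neutral / mixed).
prior_polarity = {
    ("great", None): "positive",
    ("highly", "priced"): "negative",   # a highly priced car is bad
    ("highly", "spec'ed"): "positive",  # a highly spec'ed car is good
}

def lookup(expression, target=None):
    """Prefer a target-specific entry, then a target-independent one."""
    return prior_polarity.get((expression, target),
                              prior_polarity.get((expression, None), "neutral"))

print(lookup("highly", "priced"))   # negative
print(lookup("highly", "spec'ed"))  # positive
print(lookup("great"))              # positive
```

The same expression ("highly") flips polarity with its target, which is exactly the point of annotating orientation relative to the target rather than per word.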

Slide 15

Sentiment expressions: 10K occurrences in the corpus. 13% are multi-word (like no other, get up and go). 49% are headed by adjectives, 22% by nouns (damage, good amount), 20% by verbs (likes, upset), 5% by adverbs (very).

Slide 16

Sentiment expressions: 75% of sentiment-expression occurrences have non-evaluative uses in the corpus. "light": "…the car seemed too light to be safe…" vs. "…vehicles in the light truck category…". 77% of sentiment-expression occurrences are positive. Inter-annotator agreement: 75% on spans, 66% on targets, 95% on prior polarity.

Slide 17

Modifiers → contextual polarity:
NEGATORS: "a good car" → "not a good car"
INTENSIFIERS: upward ("a very good car"), downward ("a sort of good car")
COMMITTERS: upward ("I am certain the car is good"), downward ("I suspect the car is good")
NEUTRALIZERS: "if the car is good", "I hope the car is good"
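
The four modifier classes can be sketched as operations on a numeric polarity score. The corpus annotates the modifiers themselves; the particular numbers below are illustrative assumptions:

```python
def contextual_polarity(prior, modifiers):
    """Apply modifier classes to a prior polarity score in [-1, 1]."""
    score = prior
    for m in modifiers:
        if m == "NEGATOR":               # "not a good car": flip polarity
            score = -score
        elif m == "INTENSIFIER_UP":      # "a very good car": strengthen
            score *= 1.5
        elif m == "INTENSIFIER_DOWN":    # "a sort of good car": weaken
            score *= 0.5
        elif m == "COMMITTER_UP":        # "I am certain the car is good"
            score *= 1.2
        elif m == "COMMITTER_DOWN":      # "I suspect the car is good"
            score *= 0.8
        elif m == "NEUTRALIZER":         # "if the car is good": cancel
            score = 0.0
    return score

good = 1.0
print(contextual_polarity(good, ["NEGATOR"]))           # -1.0
print(contextual_polarity(good, ["INTENSIFIER_DOWN"]))  # 0.5
print(contextual_polarity(good, ["NEUTRALIZER"]))       # 0.0
```

Negators reverse the prior polarity, intensifiers and committers rescale it up or down, and neutralizers remove the evaluation entirely.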

Slide 18

Other annotations: Speech events (opinions not sourced from the author): "John thinks the car is good." Comparisons: "Car X has a better engine than car Y." Handles a variety of cases.

Slide 19

Outline: Motivating example; Overview of annotation types; Some statistics; Potential uses of the corpus; Comparison to other resources

Slide 20

Possible tasks: detecting mentions, sentiment expressions, and modifiers; identifying the targets of sentiment expressions and modifiers; coreference resolution; finding part-of, feature-of, etc. relations; identifying errors/inconsistencies in the data.

Slide 21

Possible tasks: exploring how components interact: "Some idiot thinks this is a good car." Evaluating unsupervised sentiment systems, or systems trained on other domains. How do relations between entities transfer sentiment? "The car's paint job is flawless but the safety record is poor." A solution to one task may be useful in understanding another.

Slide 22

But wait, there's more! 180 digital camera blog posts were also annotated. Total of 223,001 + 108,593 = 331,594 tokens.

Slide 23

Outline: Motivating example (entities combine to render entity-level sentiment); Overview of annotation types; Some statistics; Potential uses of the corpus; Comparison to other resources

Slide 24

Other resources: MPQA Version 2.0 (Wiebe, Wilson, and Cardie, 2005). Largely professionally written news articles. Subjective expressions: "beliefs, emotions, sentiments, speculations, etc." Attitude annotations on subjective expressions; target and source annotations. 226K tokens (JDPA: 332K).

Slide 25

Other resources: data sets provided by Bing Liu (2004, 2008). Customer-written consumer electronics product reviews. Contextual sentiment toward mentions of the product. Comparison annotations. 130K tokens (JDPA: 332K).

Slide 26

Thank you! Obtaining the corpus (research and educational purposes): ICWSM.JDPA.corpus@gmail.com, June 2010. Annotation guidelines: http://www.cs.indiana.edu/~jaskessl. Thanks to: Prof. Michael Gasser, Prof. James Martin, Prof. Martha Palmer, Prof. Michael Mozer, William Headden.

Slide 27

Top 20 annotations
