One of SC's favorite sessions at the 2009 LSA meeting was titled "Computational Linguistics: Implementation of Analyses against Data". Go here for a listing of the papers (it's session #30). There was a conscious effort this year, clearly driven by Emily Bender and Terry Langendoen (they had a joint session to themselves earlier for this purpose), to present computational methods as desirable technical approaches to handling theoretical issues, which is exactly the sort of thing your host has always wanted to see develop further. Herewith, a little about each of the talks:
Emily Bender kicked off the discussion with a presentation on a grammar she built for the extinct language Wambaya (making use of 801 examples drawn from the documentation in Rachel Nordlinger's dissertation). Ordinarily, testing all sorts of licensing constraints and making sure that your newer rules don't break your older rules is a process that can take months. However, with the aid of the Grammar Matrix, a tool for writing and testing analyses in the Head-Driven Phrase Structure Grammar formalism, she managed to produce a grammar that correctly analyzed 91% of the cases in her development set, and 76% of cases in a separate test set, spending 210 hours in 5 1/2 weeks to accomplish this task. The introduction of formal test and development methods into the construction of theoretical analyses is welcome, and the steadily rising graph she presented to document the improvements in the grammar as a function of time was frankly astounding. If the only thing anyone took away from the presentation was that they should bring a genuine test plan into their work and actually keep metrics of their work as it progresses, the talk was a success. That it made such a convincing case for the utility of automated parsing and generation as core tools in doing theoretical work is a dream come true.
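To make the "keep metrics as you go" point concrete, here is a minimal sketch of a coverage check run against a development set. Everything in it is an illustrative assumption: `toy_analyze` is a hypothetical stand-in for a real parser, the bracketed analyses are made up, and none of it reflects the Grammar Matrix's actual interface.

```python
def coverage(analyze, items):
    """Fraction of (sentence, gold_analysis) items for which the
    grammar produces the gold analysis among its candidates."""
    correct = sum(1 for sentence, gold in items if gold in analyze(sentence))
    return correct / len(items)


def toy_analyze(sentence):
    # Hypothetical stand-in for a parser: maps a sentence to the set of
    # analyses the current grammar licenses (empty set = no parse).
    grammar_output = {
        "dogs bark": {"S(NP,VP)"},
        "the dogs bark": {"S(NP(Det,N),VP)"},
    }
    return grammar_output.get(sentence, set())


# Made-up development items: (sentence, expected analysis).
dev_set = [
    ("dogs bark", "S(NP,VP)"),
    ("the dogs bark", "S(NP(Det,N),VP)"),
    ("cats sleep", "S(NP,VP)"),
    ("bark dogs the", "S(NP,VP)"),
]

# Record (hours worked, coverage) after each change to the grammar, so
# progress and regressions show up as a curve rather than an impression.
history = []
history.append((10, coverage(toy_analyze, dev_set)))
print(history)
```

Rerunning the same check after every rule change is what catches a new rule quietly breaking an old one.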
Next up was a presentation by Jason Baldridge and Katrin Erk of progress in a research project titled Efficient Annotation of Resources by Learning (EARL). Their team is tackling the problem of constructing interlinear glosses for text in languages where little prior data is available -- a problem for just about any small minority language in the world, and hence one where an efficient computational solution could reap enormous rewards (scientifically -- the IPO might be a bit more of a pipe dream). For the LSA talk, they described an experiment in which two trained linguists were given 100,000 clauses of a Mayan language called Uspanteko (you can see an example at the project wiki); one was a speaker of the language, and the other was a theoretically knowledgeable individual with no Uspanteko experience. The question posed was: how much can you gloss in 2 weeks with a little help from a computer? And the answer appears to be: with random selections from the corpus (to keep from overtraining on sequential -- and possibly contiguous -- material), enough to get a machine learning algorithm to predict labels for the entire corpus with about 30% accuracy. That's not good enough to leave the job to the machine, obviously, but it is already good enough to help rank possible tags for a user and so speed up manual annotation, which is exactly the application they're developing. If you've never tried to use an annotation interface that doesn't know anything about what you're up to -- or worse, tried to do it in a plain-text editor -- trust SC when he tells you that any further progress these folks make will be a blessing.
Following the EARL team, Nianwen Xue, Susan Brown and Martha Palmer presented a paper titled "Computational lexicons: When theory meets data", covering work on building a computational lexicon integrating data from a number of prior projects, which you can browse here. Specifically, they wanted to provide a resource combining the semantic role data found in PropBank (a treebank that encodes data about verb arguments in real sentences) with VerbNet, a very detailed implementation of Beth Levin's work on verb classes. The reason you would want this integration is that sense data is notably lacking from the PropBank, itself an extension of the Penn Treebank, and this is a Bad Thing when trying to train a parser to assign semantic roles to new text. The tagging procedure by which they accomplish their integration is sensible enough, albeit not something to write much about, but the import of the work is clear -- you really can build a computational resource that is faithful to both the needs of statistical parsing and generation algorithms and linguistic theory. It's not hard to imagine building a variety of potentially very interesting applications using a word-sense-aware parser backed by this lexicon, because a little semantic role data is a lot better than nothing at all.
Next up, Jason Riggle and John Goldsmith presented a paper with a too-rare title, "Information-theoretic approaches to phonology", which appears to be an update of this 2007 manuscript. Prof. Goldsmith gave a plenary address at the previous LSA meeting on computational methods, based on this paper, which provoked a certain amount of misunderstanding and suspicion that he was somehow not interested in finding out what was going on inside people's heads when they use language. Nothing could be further from the truth; the current paper demonstrates how the classic autosegmental theory of phonological tiers could be expressed in terms of probabilities for both consonant and vowel segments. More than that, it introduces the use of a genuinely zero-based metric for evaluating the quality of a phonological model, by tying the comparison of models to the number of bits needed to represent segments and words. Now, SC would stipulate that it is not at all clear that the language apparatus always and everywhere chooses the most efficient coding scheme that could be computed. However, as a metric for evaluating whether or not a particular theory has explanatory power, this is an excellent approach. If you can't show that your theory actually buys you something better than a naive n-gram model, you had better have some other compelling reason for adopting your proposal. Indeed, the autosegmental model actually was not the most efficient from a bits/symbol perspective, but the evidence for tiers is compelling enough to not discard them in favor of flat bigrams.
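The bits-based comparison can be illustrated with a small sketch. This is not Riggle and Goldsmith's model; it only shows the kind of baseline a theory would have to beat, by computing the average number of bits per segment under a flat unigram model and a naive bigram model. The function names and the word-boundary marker `#` are assumptions made for illustration, and the models are scored on the same data they are estimated from, with no charge for the size of the model itself.

```python
import math
from collections import Counter


def unigram_bits_per_symbol(words):
    """Average bits per segment when each segment is coded independently."""
    counts = Counter(seg for word in words for seg in word)
    total = sum(counts.values())
    bits = -sum(c * math.log2(c / total) for c in counts.values())
    return bits / total


def bigram_bits_per_symbol(words):
    """Average bits per segment when each segment is coded given the
    preceding segment ('#' marks the start of a word)."""
    pair_counts = Counter()
    context_counts = Counter()
    for word in words:
        prev = "#"
        for seg in word:
            pair_counts[(prev, seg)] += 1
            context_counts[prev] += 1
            prev = seg
    total = sum(pair_counts.values())
    bits = -sum(
        c * math.log2(c / context_counts[prev])
        for (prev, _seg), c in pair_counts.items()
    )
    return bits / total


# Made-up toy lexicon with strict consonant-vowel alternation.
lexicon = ["baba", "bibi", "dada", "didi"]
print(unigram_bits_per_symbol(lexicon))
print(bigram_bits_per_symbol(lexicon))
```

A lower number means the model has captured more of the regularity in the data, so a proposed phonological model can be judged by how many bits it saves relative to baselines like these.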
Finally, the talk that most excited SC was saved for last -- Christopher Potts presenting work with Florian Schwarz on getting pragmatic data out of reviews from TripAdvisor and Amazon. The methodology is brilliantly simple: these sites give you a convenient 5-point scale for rating things, with clearly defined negative and positive opinions. So count up associations of ratings with words, and you've got yourself a taxonomy of emotional baggage. The details of the computation are in the linked paper, which demonstrates that "what a" tends to be a useful signal of heightened emotion:
- What a dump!
- What a nice hotel!
- What a completely quite neutral reaction I'm faking to throw off the math!
In all seriousness, phrases like "what a" are found to show up in both 1- and 5-star reviews, indicating extremity of reaction (although not polarity), while other words have more clearly directional connotations, like "wow" (positive) and "never" (negative). Even with noise of the sort introduced above, Potts and Schwarz show their results to be remarkably robust, with spurious examples of the relevant constructions occurring at frequencies orders of magnitude below those of the cases of interest. These are the sorts of lessons one would ordinarily learn through survey-based research with lots of manually tabulated results and much smaller quantities of data. As a pure language-engineering tool, the applications are obvious -- it's easy to imagine conducting tests to start classifying all sorts of words as emotionally laden, positive, negative, and so forth, and integrating that into software that acts on opinions. As a research tool for theoretical inquiry, one can just as easily imagine constructing a program to serve as a filter for finding examples deserving closer scrutiny in a corpus.
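The counting idea can be sketched in a few lines. This is only an illustration of associating a phrase with star ratings, not the computation Potts and Schwarz actually used; the function names, the crude tokenizer, and the three toy reviews are all assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    # Crude lowercase word tokenizer; a real study would need better.
    return re.findall(r"[a-z']+", text.lower())


def rating_profile(reviews, phrase):
    """For each star rating, return occurrences of `phrase` per token
    of review text carrying that rating."""
    target = tokenize(phrase)
    n = len(target)
    hits = Counter()
    token_totals = Counter()
    for rating, text in reviews:
        tokens = tokenize(text)
        token_totals[rating] += len(tokens)
        hits[rating] += sum(
            tokens[i:i + n] == target for i in range(len(tokens) - n + 1)
        )
    return {
        rating: hits[rating] / token_totals[rating]
        for rating in sorted(token_totals)
        if token_totals[rating]
    }


# Made-up toy data: (star rating, review text).
reviews = [
    (1, "What a dump!"),
    (3, "It was fine."),
    (5, "What a nice hotel!"),
]
print(rating_profile(reviews, "what a"))
```

A phrase whose rate peaks at both ends of the scale signals extremity rather than polarity, while one whose rate rises or falls steadily across the ratings is directional.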
What all of the papers from this symposium have in common is a commitment to the utility of theoretical linguistics, combined with an equally fervent commitment to the idea that systematic counting of examples is a legitimate way to validate your theories. The notion that a good theory ought to be able to survive contact with data doesn't require an abandonment of theoretical work in itself, and bringing a formal development cycle to your work is simply a dose of good-for-you discipline.