
Semisupervised Project Ideas

Christopher Brown
Department of Linguistics
University of Texas at Austin
1 University Station B5100
Austin, TX 78712-0198 USA
chbrown@mail.utexas.edu

Summary

Two very different ideas: 1) A personally-adaptive spelling correction and style-matching text-entry engine. 2) Colloquial/formal classification of texts based on cross-classification between colloquial and formal corpora.

1 Spelling Correction

Spelling correction is sub-optimal in many of its modern implementations. While some text-editing programs allow personal dictionaries, none that I know of track personal mistake tendencies. Two treatments of spelling correction algorithms, Jurafsky and Martin (2008) and Kukich (1992), describe various complications of and solutions to the problem (the latter in much more depth), but neither considers the continuous process of user input in the production and correction of errors. Their spellchecking algorithms assume a completed document and a cessation of user input.
I propose a more intelligent text-entry engine, in which the errors that contribute to easily corrected misspellings (by minimum edit distance methods, for instance) will be incorporated into the model of edit distances. For example, if a user mistypes "aling," the engine will quickly autocorrect it to "align," but if the user then mistypes "desing," the engine will remember the "ng" → "gn" edit it performed before and suggest "design" instead of "dewing" (my naive iPad suggests "dewing" even though I make that typo all the time). This might be more like active learning, but as I understand it, the difference between active learning and semisupervised learning over a non-finite (continuous?) data set is that active learning involves the human user telling the machine what to learn, while semisupervised learning entails the machine inferring what to learn from the user's normal behavior (i.e. unlabeled data); my implementation would be a combination of the two.
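To make the idea concrete, the following is a minimal sketch, in Python, of how accepted corrections could feed back into a weighted edit distance. The class name, the bigram-level bookkeeping, and the 0.5 discount are illustrative choices of mine rather than a finished design; the point is only that a transposition the user has already been observed making ("ng" → "gn") becomes cheaper than an otherwise equivalent edit, so "design" outranks "dewing" for the input "desing".

    from collections import Counter

    class AdaptiveCorrector:
        """Toy sketch: a corrector that cheapens edit operations this
        particular user has already been observed making."""

        def __init__(self, lexicon, discount=0.5):
            self.lexicon = set(lexicon)      # known-good words
            self.edit_counts = Counter()     # observed habits, e.g. ("ng", "gn")
            self.discount = discount         # cost reduction for a known habit

        def _transposition_cost(self, typed_bigram, intended_bigram):
            # Swaps this user has made before become cheaper than unseen edits.
            if self.edit_counts[(typed_bigram, intended_bigram)] > 0:
                return 1.0 - self.discount
            return 1.0

        def _distance(self, typed, word):
            # Standard dynamic-programming edit distance, plus a transposition
            # case whose cost is personalized via _transposition_cost().
            m, n = len(typed), len(word)
            d = [[0.0] * (n + 1) for _ in range(m + 1)]
            for i in range(m + 1):
                d[i][0] = float(i)
            for j in range(n + 1):
                d[0][j] = float(j)
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    sub = 0.0 if typed[i - 1] == word[j - 1] else 1.0
                    d[i][j] = min(d[i - 1][j] + 1.0,       # deletion
                                  d[i][j - 1] + 1.0,       # insertion
                                  d[i - 1][j - 1] + sub)   # substitution
                    if (i > 1 and j > 1 and typed[i - 1] == word[j - 2]
                            and typed[i - 2] == word[j - 1]):
                        cost = self._transposition_cost(typed[i - 2:i], word[j - 2:j])
                        d[i][j] = min(d[i][j], d[i - 2][j - 2] + cost)
            return d[m][n]

        def suggest(self, typed):
            # Rank known words by personalized distance to the typed string.
            return min(self.lexicon, key=lambda w: self._distance(typed, w))

        def record_correction(self, typed, accepted):
            # Called when the user accepts a correction.  If the typo was a
            # single adjacent transposition ("desing" -> "design"), remember
            # the swapped bigram so the same habit is cheap to undo next time.
            for i in range(len(typed) - 1):
                swapped = typed[:i] + typed[i + 1] + typed[i] + typed[i + 2:]
                if swapped == accepted:
                    self.edit_counts[(typed[i:i + 2], accepted[i:i + 2])] += 1

    corrector = AdaptiveCorrector(["align", "design", "dewing"])
    corrector.record_correction("aling", "align")   # learns the "ng" -> "gn" habit
    print(corrector.suggest("desing"))              # prefers "design" over "dewing"

Without the recorded habit, "design" and "dewing" are both one edit away from "desing" and the tie would be broken arbitrarily; the personalized discount is what tips the ranking.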
2 Colloquial/formal classification

This classifier will train on very large corpora that are labeled at the global level; I will consider the LDC's Switchboard corpus or the Simple English edition of Wikipedia as "colloquial," and New York Times or Wall Street Journal articles as "formal." These will be very noisy data sets, but I believe some traits of formality or colloquialism can be teased from the documents. Given those two poles of corpora, I will calculate the distance between my unlabeled, to-be-classified documents and documents in those corpora. Do and Ng (2006) present an improved algorithm for the "multiclass text classification task" that seems to be suited to similar classification between such corpora.¹

¹ The results section is somewhat opaque to me, but overall it seems like a reasonable place to start; it was not so easy finding relevant articles for this idea.
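As a first pass at that distance computation, the sketch below classifies a document by cosine similarity to the centroid of each labeled corpus. This is only a stand-in baseline of my own, not the Do and Ng (2006) method, and the two toy word lists merely stand in for the Switchboard/Simple English and NYT/WSJ corpora named above:

    import re
    from collections import Counter

    def bag_of_words(text):
        # Crude tokenization; a real system would use a proper tokenizer.
        return Counter(re.findall(r"[a-z']+", text.lower()))

    def cosine(a, b):
        dot = sum(count * b[word] for word, count in a.items())
        norm_a = sum(v * v for v in a.values()) ** 0.5
        norm_b = sum(v * v for v in b.values()) ** 0.5
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def centroid(docs):
        # Average word counts over every document in a corpus.
        total = Counter()
        for doc in docs:
            total.update(bag_of_words(doc))
        return Counter({word: count / len(docs) for word, count in total.items()})

    def classify(doc, colloquial_docs, formal_docs):
        # Label an unseen document by whichever corpus centroid it is closer to.
        vec = bag_of_words(doc)
        if cosine(vec, centroid(colloquial_docs)) >= cosine(vec, centroid(formal_docs)):
            return "colloquial"
        return "formal"

    # Toy stand-ins for the Switchboard / Simple English and NYT / WSJ corpora.
    colloquial = ["yeah so i was like you know kind of busy",
                  "um we just hung out and talked and stuff"]
    formal = ["The committee approved the measure after lengthy deliberation.",
              "Quarterly earnings exceeded analysts' expectations, the firm said."]
    print(classify("so yeah it was kind of fun you know", colloquial, formal))  # -> colloquial

A single centroid is a blunt instrument for corpora this noisy; per-document nearest neighbors, or the transfer-learning setup itself, would be the obvious next refinement.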

References

Chuong Do and Andrew Ng. Transfer learning for text classification. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18, pages 299–306. MIT Press, Cambridge, MA, 2006.

Daniel Jurafsky and James H. Martin. Speech and Language Processing (2nd edition), pages 72–79. Prentice Hall, 2008. ISBN 0131873210.

Karen Kukich. Techniques for automatically correcting words in text. ACM Computing Surveys, 24(4):377–439, 1992. ISSN 0360-0300. doi: 10.1145/146370.146380.