Analyzing The Sentiment of Tweets With Java | by Siddhant Dubey

A primer for NLP with Java

Photo by Hitesh Choudhary on Unsplash

For my final project in an Advanced Java course I was taking at school, I decided to combine my interest in Machine Learning with Java and build a Natural Language Processing project. One of the most basic Natural Language Processing projects is analyzing the sentiment of tweets on Twitter. So naturally, I set out to analyze the sentiment of tweets containing the words “USA Coronavirus Response.”

At the beginning of any Machine Learning project, the first thing you need to do is obtain data. To obtain the tweets you’ll need for your analysis, you’re going to need access to the Twitter API. Python has a ton of libraries to do this, such as Tweepy. We’re using Java, which doesn’t have as many packages as Python does, but it does have a great Twitter API package that we can use to get data: twitter4j. First, install the package. I recommend using Maven as your build tool for the project. To add the twitter4j library to your Java project, add this to your pom.xml file.

<dependencies>
    <dependency>
        <groupId>org.twitter4j</groupId>
        <artifactId>twitter4j-core</artifactId>
        <version>4.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.twitter4j</groupId>
        <artifactId>twitter4j-async</artifactId>
        <version>4.0.0</version>
        <scope>test</scope>
    </dependency>
</dependencies>

To use the API, you’re going to need to apply for a Twitter developer account. Once you have applied and received your API credentials, you can start pulling the tweets you need for this project.
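Once you have credentials, one option is to keep them out of your source files and read them from environment variables instead. The following is a minimal sketch of that idea using only the standard library. The class name, the `require` helper, and the four environment variable names are all assumptions made for illustration; they are not part of twitter4j.

```java
import java.util.Map;

// Hypothetical helper: reads Twitter credentials from environment variables
// so they never have to be committed with the source code.
class TwitterCredentials {
    // These variable names are assumptions; pick whatever names you like.
    static final String[] REQUIRED = {
        "TWITTER_CONSUMER_KEY",
        "TWITTER_CONSUMER_SECRET",
        "TWITTER_ACCESS_TOKEN",
        "TWITTER_ACCESS_TOKEN_SECRET"
    };

    // Returns the trimmed value for name, or fails loudly if it is missing or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        // Report which credentials are available without printing their values.
        for (String name : REQUIRED) {
            try {
                require(env, name);
                System.out.println(name + " is set");
            } catch (IllegalStateException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

The strings returned by `require` could then be passed to the credential setters in place of the placeholder literals.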

Here’s the code you need to scrape the data. Make this your Main.java file.

package sample;

import twitter4j.*;
import twitter4j.conf.ConfigurationBuilder;

public class Main {
    public static void main(String[] args) throws TwitterException {
        // Configure twitter4j with your own API credentials.
        ConfigurationBuilder cb = new ConfigurationBuilder();
        cb.setDebugEnabled(true)
            .setOAuthConsumerKey("Your Consumer Key")
            .setOAuthConsumerSecret("Your Consumer Secret")
            .setOAuthAccessToken("Your Access Token")
            .setOAuthAccessTokenSecret("Your Access Token Secret");
        TwitterFactory tf = new TwitterFactory(cb.build());
        Twitter twitter = tf.getInstance();

        // Search for up to 100 tweets matching the query.
        Query query = new Query("USA Coronavirus Response");
        query.setCount(100);
        QueryResult result = twitter.search(query);
    }
}

This section of code will get up to 100 tweets, which is the most a single search request returns. There are ways around this, but for the purposes of a demonstration, this should work! After you run this code, up to 100 tweets that match the term “USA Coronavirus Response” will be stored in the result variable. Now we’ll move on to building the SentimentAnalyzer class.

To build our SentimentAnalyzer class, we will use the Stanford CoreNLP package. To add this package to your project, add the following dependencies inside the dependencies element of your pom.xml file.

<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>4.0.0</version>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>4.0.0</version>
    <classifier>models</classifier>
</dependency>

The code itself is not that complex and is as follows. Put it in a separate file from Main.java, named SentimentAnalyzer.java.

package sample;

import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;

public class SentimentAnalyzer {
    static StanfordCoreNLP pipeline;

    // Builds the CoreNLP pipeline; call this once before findSentiment.
    public static void init() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        pipeline = new StanfordCoreNLP(props);
    }

    public static void main(String[] args) {
        init();
    }

    // Returns the sentiment label of the tweet, or "NULL" if the tweet is empty.
    // Labels range over very negative, negative, neutral, positive, very positive.
    public static String findSentiment(String tweet) {
        String sentimentType = "NULL";
        if (tweet != null && tweet.length() > 0) {
            Annotation annotation = pipeline.process(tweet);
            // A tweet may contain several sentences; each iteration overwrites
            // the label, so the last sentence's sentiment is what gets returned.
            for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
                sentimentType = sentence.get(SentimentCoreAnnotations.SentimentClass.class);
            }
        }
        return sentimentType;
    }
}
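
Note that findSentiment loops over every sentence in the tweet, so a multi-sentence tweet ends up with the label of its last sentence. One possible alternative is to take the label of the longest sentence instead. The sketch below shows only that selection step, without CoreNLP; the class and method names are hypothetical, and the sentence texts and labels are assumed to have been collected from the loop above.

```java
import java.util.Arrays;
import java.util.List;

// Library-free sketch: choose one label for a multi-sentence tweet by taking
// the label of its longest sentence. All names here are hypothetical.
class LongestSentenceLabel {
    // sentences.get(i) is the text of sentence i and labels.get(i) its label.
    // Ties keep the earliest sentence; empty input yields "NULL".
    static String pick(List<String> sentences, List<String> labels) {
        String best = "NULL";
        int longest = 0;
        for (int i = 0; i < sentences.size(); i++) {
            int length = sentences.get(i).length();
            if (length > longest) {
                longest = length;
                best = labels.get(i);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
            "Ugh.",
            "The briefing today was clear and genuinely reassuring.");
        List<String> labels = Arrays.asList("Negative", "Positive");
        System.out.println(pick(sentences, labels)); // prints "Positive"
    }
}
```

Whether the longest sentence is a better summary of a tweet than the last one is a judgment call, so treat this as one option rather than a fix.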

Now that our Sentiment Analyzer class is done, we can go and build the complete project by putting it all together.

We’re going to be putting it all together in the Main.java file. Here’s the code:

package sample;

import twitter4j.*;
import twitter4j.conf.ConfigurationBuilder;
import static sample.SentimentAnalyzer.findSentiment;
import static sample.SentimentAnalyzer.init;

public class Main {
    public static void main(String[] args) throws TwitterException {
        // Load the CoreNLP pipeline before classifying anything.
        init();

        ConfigurationBuilder cb = new ConfigurationBuilder();
        cb.setDebugEnabled(true)
            .setOAuthConsumerKey("Your Consumer Key")
            .setOAuthConsumerSecret("Your Consumer Secret")
            .setOAuthAccessToken("Your Access Token")
            .setOAuthAccessTokenSecret("Your Access Token Secret");
        TwitterFactory tf = new TwitterFactory(cb.build());
        Twitter twitter = tf.getInstance();
        Query query = new Query("USA Coronavirus Response");
        query.setCount(100);
        QueryResult result = twitter.search(query);

        // Counters for the number of tweets of each sentiment.
        int num_neutral = 0;
        int num_negative = 0;
        int num_realnegative = 0;
        int num_positive = 0;
        int num_realpositive = 0;

        for (Status status : result.getTweets()) {
            String sentiment = findSentiment(status.getText());
            if (sentiment.equalsIgnoreCase("Neutral")) {
                num_neutral++;
            } else if (sentiment.equalsIgnoreCase("Negative")) {
                num_negative++;
            } else if (sentiment.equalsIgnoreCase("Very Negative")) {
                num_realnegative++;
            } else if (sentiment.equalsIgnoreCase("Very Positive")) {
                num_realpositive++;
            } else {
                // Any label not matched above is counted as positive.
                num_positive++;
            }
        }

        System.out.println("Neutral: " + num_neutral);
        System.out.println("Negative: " + num_negative);
        System.out.println("Very negative: " + num_realnegative);
        System.out.println("Positive: " + num_positive);
        System.out.println("Very positive: " + num_realpositive);
    }
}

All this code does is loop through each tweet that you obtained, classify the sentiment, and then increment the number of tweets of that sentiment. At the end, it spits out the number of tweets that are of each sentiment. If you want to expand on this, you can try making a GUI with JavaFX and adding graphs to visualize the data. Hopefully, you enjoyed this brief introduction to NLP with Java. There are a lot more cool things you can do with it and I hope you make something interesting!
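
If you do extend the project, one small refactor to consider first is replacing the five counters with a map, which makes the counts easier to hand to a chart. The sketch below is one way to do that with only the standard library, and it prints a rough text bar chart as a stand-in for a real graph. The class name and the "Other" bucket are assumptions of this sketch: in Main.java any unrecognized label, including "NULL", is counted as positive, while here it is kept separate. The label spellings are assumed to match what findSentiment returns, ignoring case.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that tallies sentiment labels into an ordered map.
class SentimentTally {
    // Assumed label spellings; matching below ignores case.
    static final List<String> LABELS = Arrays.asList(
        "Very negative", "Negative", "Neutral", "Positive", "Very positive");

    // Counts each label; anything unrecognized (such as "NULL") goes to "Other".
    static Map<String, Integer> tally(List<String> sentiments) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String label : LABELS) {
            counts.put(label, 0);
        }
        counts.put("Other", 0);
        for (String sentiment : sentiments) {
            String key = "Other";
            for (String label : LABELS) {
                if (label.equalsIgnoreCase(sentiment)) {
                    key = label;
                    break;
                }
            }
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up labels standing in for the output of findSentiment.
        List<String> sample = Arrays.asList(
            "Neutral", "negative", "Very Positive", "NULL", "Neutral");
        for (Map.Entry<String, Integer> entry : tally(sample).entrySet()) {
            StringBuilder bar = new StringBuilder();
            for (int i = 0; i < entry.getValue(); i++) {
                bar.append('#');
            }
            System.out.printf("%-14s %2d %s%n", entry.getKey(), entry.getValue(), bar);
        }
    }
}
```

Because the map keeps its insertion order, the rows always come out from very negative to very positive, which is the order you would likely want on a chart axis as well.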


