Tuesday, September 9, 2014

Tweet Sentiment Analysis

Hey there,

It’s been a while since we last posted, but we are pretty excited about our new sentiment analysis transforms so I thought I would make a quick post about them.

Sentiment analysis can be described as the use of natural language processing (NLP) to extract the attitude or opinion of a writer towards a specific topic. With the overwhelming amount of data being posted on the Internet every day and no way to read it all, sentiment analysis has become a really useful tool for extracting and aggregating opinions on a specific topic from many different sources. The potential uses for sentiment analysis are endless; a few examples are brand reputation monitoring, market research and stock-exchange monitoring. The transform that we built takes a Tweet as its input entity and returns either a positive, neutral or negative entity. This way a large number of Tweets can easily be categorized according to their sentiment.

Although sentiment extraction is a relatively new area of research, there are quite a few methods of going about it and a lot of companies offering different sentiment analysis APIs. With many APIs to choose from it was quite difficult to decide which one would work best for the transform. At first I decided to use my top four APIs, aggregate their results and use that as the output of my transform. The problem with this method was that most of the time the APIs would return different and often obviously incorrect results (I won’t mention any names). While some APIs seemed to work well on certain topics of Tweets, they would fail horribly on others. After much experimentation I settled on using only AlchemyAPI’s sentiment analysis tool, which seems to work the best out of all the APIs that I tested, and I tested quite a few, so well done to them.
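For the curious, aggregating several API results can be sketched as a simple majority vote. This is only an illustration in Python, not the code of the transform: the function name `aggregate_sentiment`, the label strings and the rule that ties fall back to neutral are all assumptions, and no real API is called here.

```python
from collections import Counter

# Hypothetical label set; real APIs each use their own naming and scores.
LABELS = ("positive", "neutral", "negative")


def aggregate_sentiment(votes):
    """Combine per-API labels into one label by majority vote.

    Unknown labels are ignored. An empty input or a tie for first
    place falls back to "neutral" (an assumption of this sketch).
    """
    counts = Counter(v for v in votes if v in LABELS)
    if not counts:
        return "neutral"
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "neutral"
    return ranked[0][0]
```

When the APIs disagree as often as they did in my tests, a vote like this mostly produces ties, which is part of why a single well-performing API ended up being the simpler choice.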

I then built a new machine named Twitter Analyser to use with the new sentiment analysis transform. This machine takes a phrase as its input and searches Twitter for Tweets containing this phrase. From the Tweets that are returned, hashtags, links, sentiment and uncommon words are extracted. The uncommon words are extracted with one of our other new transforms, To Words, which checks each word against an ordered list of common words; if the input word does not occur in the list before a certain threshold, the word is returned as an entity. The To Words transform takes two transform settings: the threshold up to which it must search in the list of common words, and words that should be ignored by the transform. The machine will run every 5 minutes to search Twitter for new Tweets. Running the machine in bubble view makes it easy to see common hashtags, links, words and sentiment between Tweets on a certain topic. The screenshot of the graph below shows an example of using the Twitter Analyser machine on the phrase AlchemyAPI:
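The uncommon-word check can be sketched roughly as follows. This is a minimal Python illustration, not the transform's actual code: the function name `uncommon_words`, its parameters and the very naive tokenizer are assumptions, and the common-word list is expected to be ordered from most to least common.

```python
import re


def uncommon_words(text, common_words, threshold, ignore=()):
    """Return words from `text` that are not among the first `threshold`
    entries of `common_words` and are not in `ignore`.

    Each word is returned once, in order of first appearance.
    """
    frequent = {w.lower() for w in common_words[:threshold]}
    ignored = {w.lower() for w in ignore}
    seen = set()
    result = []
    # Naive tokenizer (assumption): runs of letters and apostrophes only.
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in frequent or word in ignored or word in seen:
            continue
        seen.add(word)
        result.append(word)
    return result
```

With the hypothetical list `["the", "is", "a", "of", "api", "great"]` and a threshold of 4, the text "The API is great" yields `["api", "great"]`; raising the threshold to 6 yields nothing, because both words now fall inside the searched part of the list.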


In this image the entities are sized according to their number of incoming links so you can see what is common between many Tweets. From the image you can see common hashtags like #ai, #deeplearning and #sentimentanalysis, as well as pick out the common links and words between the Tweets.

As always enjoy responsibly!
Paul

PS: As most of you already know, we have recently released an update to Maltego [version 3.5.2]. Our YouTube video here gives a quick breakdown of the new features: https://www.youtube.com/watch?v=QK6PX4Fq5xY&list=UUThOLpqhLFFQN0nStdkyGLg

Tuesday, April 1, 2014

Maltego Carbon - now!

Hi there,

It took a while to get there (and this release was tricky to schedule) but we're finally ready with Maltego Carbon. As this is a major release you'll need to download a fresh copy from the website  - it does not update from Tungsten.

Major new features are:

  1. OAuth - means you can use Twitter transforms again!
  2. New and improved tabular import - create a graph from XLS/CSV  
  3. Discovery of Maltego configs from NTDS (a much needed feature for those with their own servers).
  4. Bug fixes, optimization and new looks. 

Andrew explains all of this in the next episode of AndrewTV (click below for this exciting episode):


For server clients - you can download new updated servers from the server portal:

  1. NTDS - now with paired configuration functionality and generic OAuth support. You can simply backup your configuration and load it into the new VM.
  2. CTAS - with new Twitter transforms and fancy *_SE transforms.

Maltego Carbon is live for download from the [Paterva website] right NOW. As always the free community edition of Carbon will follow in a few weeks.

Enjoy responsibly!!!
RT






Thursday, March 20, 2014

Where is my Maltego release?

Good question. We said we'll release this week and we haven't. Actually - truthfully, we said we'll release LAST week and we didn't. There are good reasons for this delay.

The updater broke - don't ask why... Which means we can't send the new release as an update. Which in turn means we should really have a major new version and new builds - and a new element name. And that's OK, because we've added a whole lot of cool new features into this release. But - it also means that we need to do a bit of rebranding and have a few more edges to smooth out. And all of this takes time. Additionally it's a short week in SA (Human Rights Day tomorrow), we finally moved into our Cape Town offices (and had to sort out Internet access) and nobody pushes out a new release on a Friday.

We CAN tell you that we're trying our best to have Maltego CARBON out before the end of the month. Yeah I know...don't even mention it. But you know that it's close when we're deciding on splash screens and testing angles for the associated video.

We'll leave you with the shortlist of Maltego Carbon splash screens:




Monday, January 20, 2014

reCaptcha: Stop spam...and everyone else.

Hey guys,

About a week ago we noticed that the images we were getting from reCaptcha were near impossible to solve. And wow, did people let us know: we've had emails ranging from just strings of cuss words to people who refuse to touch anything we do after it.

Unfortunately, as we were just passing on an image we got from the Google reCaptcha service (http://www.google.com/recaptcha), it wasn't really in our control. We have seen them go through bouts of absolute madness, but usually only for a day or so. We decided to leave it for a week and hoped it would get better. It sadly didn't.

At this stage we are no longer verifying reCaptcha images (you can put in anything and it will validate) while we explore other options.

We apologise for the inconvenience, frustration and mental instability that trying to solve those impossible CAPTCHAs caused.

Monkeys gone to heaven.
-AM

Monday, December 23, 2013

Christmas special. Another useful Paterva tradition.

All,

It's that time of the year again. When going grocery shopping feels like a scene from The Walking Dead. Human meat density coefficient just waaay too high. Going to the office feels like a scene from The Walking Dead (but another episode). Not fun either way.

Since we moved to Cape Town about 3 weeks ago we still don't have formal offices and as such I am working from home (Andrew is back up in Gauteng spending time with family). I suspect many of you are doing the same - still very much connected, online and reading those emails marked as 'I'll get back to that when I am not so busy'.  Pro tip - start with 'Sorry for not getting back to you earlier'. It's already weird. Just face it.

Every year we run a Christmas special. And it's really a special - not a silly 10% off. In 2011 it was 50% off. In 2012 it was 33% off. This year - 2013 - it's 50% again. We're all for consistency. That means you get the commercial version at $380. Remember - the community edition is still FREE. We know that at $380 it's hardly an impulse buy - so you want to get your boss/FD on the phone (if he is not relaxing next to a pool with a G&T) and convince him that this REALLY only happens once a year!

The special is aimed at the people that are still hard at work at this time of the year. It's your reward for being the backbone of the 'skeleton staff'.

Enjoy the festive season. Show restraint when it comes to family matters. Embrace social awkwardness. If you're reading this you're probably the weird uncle...

Oh. Ya. The coupon. It's "GiveMaltegoAsAGift". Try it as a gift. Your wife / husband / girlfriend / boyfriend / dog / cat / pet ferret will enjoy our technology!

Note: Offer runs from today to the 26th of Dec.

Friday, November 8, 2013

Maltego CaseFile v2 released!

Maltego CaseFile version 1 was really cute. You could draw pretty pictures with it and show it to your friends. We even made a Game of Thrones graph with it - because we had friends that did not read the books and only started watching it at Season 3. And we were tired of explaining all the intricate relationships to them.

We didn't give CaseFile a lot of attention. In a way it was like washing your elbows. You wash your face and under your arms and so on.. but you don't actively think about washing your elbows. It's not like you would tell your child "Hey Pietie - make sure you wash your elbows tonight OK?". And so it was with us and CaseFile.

We released Maltego Tungsten at BlackHat USA in August this year. For the next couple of months CaseFile would sit next to Tungsten on the website. It would look at Tungsten with envy. Tungsten was shiny and new. It had nice shoes and a pretty dress. It was all grown up. CaseFile was left behind and it seemed that nobody at Paterva cared. CaseFile cried a little bit.

But then something wonderful happened! The developers picked CaseFile up from the shelf. They dusted her off. They gave her new shoes and a pretty dress - as pretty as Tungsten's. They gave her a complete makeover and called her CaseFile version 2.0. They loved her again - they gave her a new splash page and made a cool new video with her name in lights and they talked about her all day long:



CaseFile was happy again. And you could be too.
Download a fresh copy now from our website (www.paterva.com).


Thursday, October 24, 2013

Andrew makes a blog entry! Also the story of KingPhisher!

Hi Interwebtonians,

It has been absolutely ages since I have written a blog post - and it's not for lack of prodding from Roelof. We have genuinely just been busy!

Predominantly I want to show you some of the work we had to do for Blackhat 2013 - my first BH talk ever! My section of the work was what we ended up calling 'KingPhisher' as well as the multi-threaded Python script to crawl websites for some parts of 'Teeth' (Roelof's offensive Maltego transforms).

<TL;DR>
    Video: [HERE]
    Download: [HERE]
</TL;DR>

A common Paterva office tradition is that if you make a mistake, or if the other person can catch you out at anything, you have to make tea (the number of times I make tea is inversely proportional to how long I have been at Paterva!). This included phishing. Many years ago we would try to trick each other into clicking on links. Most security people will agree with us when we say that if you have enough context on a person you can craft an email and include a link on which they *will* click. Additionally we have used Maltego to gain context on people for a while, specifically using social networks (including transforms provided commercially via the SocialNet package). We also accept that there are certain types of mail we seldom check (in terms of headers/other); we have been semi-programmed to rely on automatic spam filtering and anti-virus to notify us if something is bad. Bottom line -- we don't inspect every link on every mail and we doubt you do either.

So with this in mind we decided to integrate the two sides - 1) targeted phishing attacks and 2) information gathering in Maltego.

The first _really_ exciting part for me is that we took the first steps towards protocol 3, what's known as graph in/graph out. In this case it was just sending the graph out, but it meant that we could finally receive context on the entities sent to transforms! It uses the new 'Send to URL' transform that POSTs the graph data in XML to a specific script (e.g. http://zer0cool.tld/graphin.php). This script then returns a URL to Maltego which in turn starts a browser with that page. What this gives you is the ability to do customised exporting of data for things like viewing graphs online, reporting or doing additional data mining based on context (NOTE: There is a limit of 50 entities for this 'transform').
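To give a rough idea of what a script on the receiving end has to do, here is a minimal sketch in Python (used here purely for illustration; the example URL above points at a PHP script). It assumes the graph arrives as one XML string that can be stored untouched; `store_graph`, the `graphs` table and the base URL are hypothetical names, not part of Maltego or of any released code, and the web-server plumbing around the function is left out.

```python
import sqlite3
import uuid


def store_graph(conn, graph_xml, base_url):
    """Save the POSTed graph XML and return the URL to hand back to Maltego.

    `conn` is an open sqlite3 connection. The XML is stored as-is;
    parsing it is left to whatever page is served at the returned URL.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS graphs (id TEXT PRIMARY KEY, xml TEXT NOT NULL)"
    )
    graph_id = uuid.uuid4().hex  # random identifier, hard to guess
    conn.execute(
        "INSERT INTO graphs (id, xml) VALUES (?, ?)", (graph_id, graph_xml)
    )
    conn.commit()
    return f"{base_url}?id={graph_id}"
```

The returned URL is what the script would send back as its response, and Maltego then opens a browser on that page.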

Please note I have added this transform to a set so that I don't need to go find it.
(Sets can be managed under the manage tab->Manage transforms)

The first section tackled was the Maltego side of things, which has been done before. You can give it a go yourself within the tool or watch our videos. Having context on the graph means you can do something like Person->Email->social network membership. It means you know a) the person's name, b) their email address and that they use it for social networking, and c) what their social network profile is. From the social network you can mine for particular types of information that you can leverage for the phishing attack.

In the above example we see that andrew@punks.co.za relates to my Facebook account and that I use andrewmohawk@gmail.com for my Twitter account with an Alias 'AndrewMohawk'.

This takes us to the second part - the KingPhisher web application. This web application is made up of the following sections:
  • The 'receiver' accepts the POST of the graph from Maltego and stores it in a local sqlite database, then returns a URL to Maltego which is automatically opened.
  • The 'wizard'/interface. This is the wizard/interface that will be used to craft templates based on information available in the graph.
  • The 'sender'. This is merely a PHP SMTP script that you can move around to send the actual mail. It ensures you can keep the wizard/main interface separate from the machine you send mail from.
  • The 'catchers'. These are fake websites used to attempt to capture credentials (where needed).

The receiver parses all the XML and works out what is connected into 'trees' that consist of a parent and N children - where at least one of the entities, either parent or child, is an email address.
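The grouping into 'trees' can be sketched as below. This is a simplified Python illustration, not the receiver's real code: it assumes the XML has already been parsed into a dict of entities and a list of parent/child links, and the type names ("email", "person" and so on) are made up for the example.

```python
from collections import defaultdict


def build_trees(entities, links):
    """Group a parsed graph into parent/children 'trees'.

    entities: dict mapping entity id -> (type, value)
    links:    list of (parent_id, child_id) pairs
    A tree is kept only if the parent or at least one child is an email.
    """
    children = defaultdict(list)
    for parent_id, child_id in links:
        children[parent_id].append(child_id)

    trees = []
    for parent_id, child_ids in children.items():
        members = [parent_id] + child_ids
        if any(entities[m][0] == "email" for m in members):
            trees.append({
                "parent": entities[parent_id],
                "children": [entities[c] for c in child_ids],
            })
    return trees
```

For a graph of Person->Email->Profile plus an unrelated Domain->Website pair, this gives two trees (Person with its email child, and Email with its profile child), while the Domain tree is dropped because it contains no email address.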

Two 'trees' shown from the previous graph


The Wizard will look at the 'trees' and figure out which templates are available for use. As an example, if a tree has a Facebook profile we can use a Facebook template as well as generic ones that don't require additional context and if it had a Twitter account we could use a template relating to Twitter as well as the generic templates.

Once you have selected a particular tree and a template you can then configure it. Each template has one standard configuration option that determines how the link would behave. The options are:
  1. Clean redirect - simply changes the link to a location you have selected.
  2. Bounce redirect - changes the link to a KingPhisher 'catcher' which once browsed to will redirect the target to a user selectable location. It will also capture and store the user agent and IP address.
  3. Collect - This will redirect to a catcher that will look like a legitimate website. It also captures the user agent and IP address as well as any credentials entered into the fake website.  In future these sites could/should be made a little more intelligent by only serving sites if the target is coming from the correct IP range or serving different websites based on the user agent. 
The wizard screens are shown below:


The templates available to this email address based on the context

The template settings for the Twitter template where the "fromProfile" field
has been entered by the attacker



The rendered version of the template ready to be sent out


Once the templates have been selected and configured they can be viewed and saved. When everything is fine tuned the emails can be sent out to the targets. The sending process is routed via the 'sender' script which can either live on the same machine as the interface/wizard or anywhere else on the Internet.

Getting templates into the actual mailboxes without them hitting spam filters proved particularly difficult as there were 3 main things that common email providers seemed to look for:
  1. SPF/DKIM for the domain you were using for the spoof address - this means no email from *.facebook.com, *.twitter.com etc.
  2. The DOM markup of each template (if it was too similar to the original one it was flagged) -- so no stealing of templates.
  3. Particular phrases within templates - this was probably the trickiest to get around as often it was strings like company address or name. It took a few runs to get it right!

Once we had got around these (you can see the email addresses and templates we use in the code) the mails were delivered to the inboxes of our targets (in this case my Gmail account):

The newly received Twitter email


The opened email in my mailbox (not flagged as spam and from "Twitter")


The fake Twitter site

After this process has been completed the attackers can then sit back and enjoy watching their Maltego machine run. The machine will query the KingPhisher server for campaigns (emails sent out), then retrieve those email addresses and any additional information (UA/IP for 'bounce' type links and the posted fields/other collected data for the 'collect' type links).

The sequence of transforms in the machine is shown below:

At this stage the user has not entered any details into the fake site,
merely opened it, and his/her UA and IP are collected


The user's details entered into the fake site.


To get KingPhisher you can go to http://www.paterva.com/BlackhatUSA2013/ and download the ZIP package. Inside the ZIP are a number of documents relating to installation as well as extending the interface, creating templates and so on. Have fun!

So long and thanks for all the shoes!
-AM