Wednesday, February 11, 2015

Building your own LovelyHorse monitoring system with Maltego (even the free version) - it's easy!

Someone linked me to the [LovelyHorse] thingy. If you missed it - it's basically a leaked GCHQ/NSA document containing a list of a few security related Twitter accounts that the GCHQ/NSA was supposedly monitoring. Seeing that, since the last release, we have some interesting Twitter functionality in Maltego, I figured it would be interesting to see how we can replicate their work.

First - manually

Before even starting with Maltego I first spent some time thinking about what I really wanted from this and did it all by hand (still in Maltego, but before we start to automate the process). As a start I'd need to get the people's Twitter handles. Well that's easy - the document lists them all. In Maltego I can start with an alias and run the transform 'AliasToTwitterUser' to get the actual Twitter handle:


I want to get the Tweets that the people wrote. There's a transform for that too - 'To Tweets [that this person wrote]'.


OK great - now I have the last 12 Tweets (my slider was set to 12). What information can I extract from the Tweet itself - keeping in mind that I want to end up doing this across 36 different handles? Well - possibly extract any hashtag, any URL mentioned in the Tweet and any other Twitter user's handle. There are transforms for all those.
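If you're wondering what such extraction boils down to, here is a rough Python sketch. It is not how the Maltego transforms are implemented - the regular expressions and the function name extract_entities are my own assumptions, and it does not resolve t.co links:

```python
import re

# Illustrative patterns only (assumptions); real Tweets have more edge cases.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"(?<![\w@])@(\w{1,15})")
URL_RE = re.compile(r"https?://[^\s]+")


def extract_entities(tweet_text):
    """Return the hashtags, URLs and mentioned handles found in one Tweet."""
    urls = URL_RE.findall(tweet_text)
    # Remove URLs first so a '#fragment' inside a link is not taken for a hashtag.
    text_without_urls = URL_RE.sub(" ", tweet_text)
    return {
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(text_without_urls)],
        "urls": urls,
        "aliases": MENTION_RE.findall(text_without_urls),
    }


# Made-up sample Tweet, purely for illustration.
sample = "New #malware write-up with @example_user: http://t.co/abc123 #InfoSec"
entities = extract_entities(sample)
```

Lower-casing the hashtags is a design choice: it lets '#InfoSec' and '#infosec' collapse into one node, which is what you want when looking for matches between people.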


Running those on a single Tweet we get something like this:


Note how the http://t.co links are nicely resolved. This Tweet didn't contain any other aliases - so you only see the hashtags and the URLs.

If we select all the Tweets and run the 3 transforms across them we see that there are some matches - in this case on the hashtags 'infosec' and 'malware':


Now, this is not really interesting at all but it's a starting point. When I do the same for the last 12 Tweets of all of the lovely horses (as I'll call this group of Twitter handles) I might see some pattern.

Now - all the horses

I copy the text from the PDF and paste it into a text editor - Notepad will do. Clean it up a bit and we have:

Select all and paste into Maltego. It will result in every line being mapped as a phrase. Select all the entities (control A) and change the type to 'Alias' and run the 'AliasToTwitterUser' transform on all the phrases - like we did at the beginning, except now we're doing it on all the aliases. It should look something like this:

At this stage I can get rid of the Aliases because I am not going to use them anymore. I click on 'Select by Type' on the ribbon (Investigate) and select 'Alias'. Delete - and they're gone. I do a re-layout, select them all and run the 'To Tweets [that this person wrote]' - this time on all of them. Essentially I am repeating the entire process we did - but this time on all the lovely horses.

When the transforms complete the graph now looks like this:


All that's left is to run the 3 transforms (Pull URL/hashtag/alias) on all the Tweets. To select all the Tweets quickly I use 'Select by type' - Twit again. This takes a while to complete...but when Maltego has pulled out all the hashtags, URLs and aliases from the last 12 Tweets of all the lovely horses it looks like this:

No doubt this looks like ass. It's because the block layout is not really suited for this type of graph. But click on Bubble View:


and you get:


Let's get real

I won't lie - I've been spoon-feeding you up to now. Let's stop now - else this blog post is going to morph into a book. I am going to assume that you have a bit of Maltego experience under the belt by now. 

The way we've been doing it up to now is really not terribly interesting or accurate. We're getting the last 12 Tweets. What we really want is all the Tweets in the last X seconds. Imagine that one horse hasn't been on Twitter in 14 days - then matching his/her Tweets to what's happening right now does not make a lot of sense (in a monitoring scenario). We need to introduce the idea of a sliding time window. The Twitter transforms did not support that.

Didn't. Does now. Well - the 'To Tweets [that this person wrote]' does now. I hacked it quickly. Anton will not approve...but it works as it says on the tin. I've added a transform setting called 'Window' - by default 0 but when changed implements this 'in the last X seconds'. 
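The idea behind the 'Window' setting can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the transform's actual code; the function name tweets_in_window and the (created_at, text) tuple layout are assumptions, as is treating 0 as 'no window':

```python
from datetime import datetime, timedelta, timezone


def tweets_in_window(tweets, window_seconds, now=None):
    """Keep only Tweets written in the last `window_seconds` seconds.

    `tweets` is a list of (created_at, text) pairs with timezone-aware
    datetimes. A window of 0 is assumed to switch the filter off.
    """
    if window_seconds <= 0:
        return list(tweets)
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=window_seconds)
    return [(created, text) for created, text in tweets if created >= cutoff]


# Made-up data: one fresh Tweet and two that fall outside a half hour window.
now = datetime(2015, 2, 11, 12, 0, tzinfo=timezone.utc)
tweets = [
    (now - timedelta(minutes=5), "fresh"),
    (now - timedelta(minutes=45), "stale"),
    (now - timedelta(days=14), "very stale"),
]
recent = tweets_in_window(tweets, 1800, now=now)
```

With a window of 1800 only the five minute old Tweet survives, so a horse that last Tweeted 14 days ago contributes nothing to the graph.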

Now it becomes interesting when combined with machines (scripted transforms) - especially perpetual machines.  Consider the following machine:

machine("axeaxe.LovelyHorse", 
    displayName:"LovelyHorse", 
    author:"RT",
    description: "Simulates the GCHQ's LH program") {

    onTimer(240) {
        type("maltego.affiliation.Twitter",scope:"global")
        run("paterva.v2.twitter.tweets.from",slider:15,"window":"1800")
        paths{
            run("paterva.v2.pullAliases")
            run("paterva.v2.pullHashTags")
            run("paterva.v2.pullURLs")
            run("paterva.v2.TweetToWords")
        }

        //half hour + half hour = one hour
        //the entities will be deleted if older than half hour
        //but the transforms time frame adds another half hour
        age(moreThan:1800, scope:"global")
        type("maltego.Twit")
        delete()
        
        age(moreThan:1800, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

Let's take a look. We run our sequence every 4 minutes (4 x 60s = 240s). We grab all the Twitter handles and get their Tweets - at most 15 per handle, and only if they were written in the last half an hour (30 x 60 = 1800s - we set the window parameter to 1800). After this it's plain sailing - we get the Aliases, hashtags and URLs. 

At some stage we need to get rid of old Tweets - else our graph will just grow and grow and grow. So there's a little logic to delete nodes when they're older than half an hour. This  means that at any stage we have a one hour view on the activity of the horses. One hour - because on the limit the initial transform can contain a Tweet that's 30 minutes old and it will stay on the graph for another 30 minutes. 

The resulting graph will show us when they are Tweeting the same keyword (courtesy  of the 'TweetToWords' transform), hashtag, mention the same website or mention the same Twitter handle in their Tweets. And if they are not active on Twitter - then the graph won't contain outdated info. 

Of course - the values can be changed depending on how closely you want to monitor the situation - if the resolution is a day then the values should be (24 x 60 x 60) /2 and you should 1) up the number of Tweets returned in the slider value (as TheGrugq Tweets waaay more than 15 Tweets in a day) and 2) you shouldn't have to poll every 4 minutes.
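The arithmetic is simple enough, but here it is as a small Python sketch in case you want to play with other resolutions. The function names and the dictionary keys are mine (assumptions), and the sketch ignores the few extra minutes that can pass between two timer runs:

```python
def machine_settings(resolution_seconds):
    """Split the desired view (the 'resolution') evenly between the transform
    window and the age at which entities get deleted."""
    half = resolution_seconds // 2
    return {"window": half, "delete_older_than": half}


def oldest_visible_tweet_age(settings):
    # A Tweet can be `window` seconds old when it is fetched and then stays
    # on the graph for up to `delete_older_than` seconds more.
    return settings["window"] + settings["delete_older_than"]


hourly = machine_settings(60 * 60)       # the machine shown above
daily = machine_settings(24 * 60 * 60)   # a one day resolution
```

For the hourly case both values come out at 1800, and for a day they come out at 43200, matching the (24 x 60 x 60) / 2 mentioned above.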

Advanced
Right - so what we REALLY want is something that can tell us when more than X horses' Tweets are linking to the same thing (be that a website/hashtag/whatever). For that we can't just use the 'incoming()' filter because one person could be sending ten Tweets mentioning the same website and it would mean that the website has ten incoming links. No - it has to have unique starting nodes (the horses).

We have that filter. It's called 'rootAncestorCount()'. So now - with a combo of bookmarks and this filter hackery we build something like:

        //if an entity links to moreThan 2 horses & 
        //we haven't seen it before  - mail
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmarked(1,invert:true)
        run("paterva.v2.sendEmailFromEntity",EmailAddress:"roelof@paterva.com",EmailMessage:"Multiple horses mentioned: !value!",EmailSubject:"Horse Alert")
       
        //this is to ensure we don't email over & over
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmark(1)


Basically what happens here is that we check for all entities with more than one incoming link (this can only be hashtags/URLs/words/aliases) and find the ones that have more than 2 unique grandparents (e.g. horses). If we find them, and we haven't seen them before (this Boolean flag is implemented with a bookmark) we mail the value out. We do the mailing with a transform that we wrote (and for obvious reasons cannot make public else it will be used for spam). It's not rocket science tho.
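To make the difference between counting incoming links and counting unique horses concrete, here is a rough Python sketch of the same logic outside Maltego. The function names and data layout are assumptions for illustration only; the set called seen plays the role of the bookmark:

```python
from collections import defaultdict


def unique_horses_per_entity(tweets_by_horse, entities_by_tweet):
    """Map each extracted entity to the set of horses whose Tweets mention it."""
    horses = defaultdict(set)
    for horse, tweet_ids in tweets_by_horse.items():
        for tweet_id in tweet_ids:
            for entity in entities_by_tweet.get(tweet_id, ()):
                horses[entity].add(horse)
    return horses


def new_alerts(horses_by_entity, already_alerted, more_than=2):
    """Return entities mentioned by more than `more_than` distinct horses that
    have not been reported yet, and remember them (the 'bookmark')."""
    fresh = sorted(
        entity
        for entity, horses in horses_by_entity.items()
        if len(horses) > more_than and entity not in already_alerted
    )
    already_alerted.update(fresh)
    return fresh


# Made-up graph: horse_a Tweets about example.com three times,
# while three different horses each mention #security once.
tweets_by_horse = {"horse_a": ["t1", "t2", "t3"], "horse_b": ["t4"], "horse_c": ["t5"]}
entities_by_tweet = {
    "t1": ["#security", "example.com"],
    "t2": ["example.com"],
    "t3": ["example.com"],
    "t4": ["#security"],
    "t5": ["#security"],
}
seen = set()
counts = unique_horses_per_entity(tweets_by_horse, entities_by_tweet)
first_pass = new_alerts(counts, seen)
second_pass = new_alerts(counts, seen)
```

Both entities have three incoming links, but only '#security' has three unique horses behind it, so it is the only alert. The second pass returns nothing because the alert has already been sent.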

Such a machine can run for days...resultant graph for today (it's almost midnight), when configured with a one day window looks like this:


Highlighted with its entire path here is the hashtag 'security' - no surprise here. The other one was the alias DaveAitel (again not surprising). Below is the email received. Remember that we'll only receive email ONCE per alert, that it's only when more than 2 horses link to it and only if it happened within a day.



The complete machine looks like this (please change values as needed - speed / resolution /etc):

machine("axeaxe.LovelyHorse", 
    displayName:"LovelyHorse", 
    author:"RT",
    description: "Simulates the GCHQ's LH program") {

    onTimer(600) {
        //find Twitter handles on graph
        type("maltego.affiliation.Twitter",scope:"global")
        
        //run to Tweets transform
        run("paterva.v2.twitter.tweets.from",slider:30,"window":"43200")
        
        //extract Alias/Hashtags/URLs and uncommon words
        paths{
            run("paterva.v2.pullAliases")
            run("paterva.v2.pullHashTags")
            run("paterva.v2.pullURLs")
            run("paterva.v2.TweetToWords")
        }

        //if an entity links to more than 2 unique horses & 
        //we haven't seen it before  - mail it out
        //comment this entire section if you don't have a mailing transform

        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmarked(1,invert:true)
        
        run("paterva.v2.sendEmailFromEntity",EmailAddress:"roelof@paterva.com",EmailMessage:"More than 2 horses mentioned: !value!",EmailSubject:"Horse Alert")
        
        //this is to ensure we don't email over & over
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmark(1)
        
        
        //delete nodes when they grow old
        //half day + half day = one day
        //the entities will be deleted if older than half a day
        //but the transform's time frame adds another half day
        age(moreThan:43200, scope:"global")
        type("maltego.Twit")
        delete()
        
        age(moreThan:43200, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

I hope you've enjoyed this (waaaaay too long) blog post on how our thinking goes. Of course - you get a lot more understanding of these things if you do it yourself. All of the above functionality exists in the (free) community edition of Maltego too - although there you probably want to monitor shorter intervals (say 15 minutes) as you can only display 12 Tweets per person. All in all - that's probably better.. ;)

'laters,
RT

PS: for more information on machines check out our newly built dev portal at [http://dev.paterva.com/developer]. The machine syntax etc. is located in 'Advanced'. 

And also - we made a video some time ago that shows the same kind of principle - it's [ here ]

Tuesday, February 3, 2015

Calling all transform writers! and Maltego Chlorine details! and MORE!

All,

In a few weeks we'll be releasing a new version of Maltego. We're calling it Maltego CHLORINE! (we're sure the malware analysts will love the name as ...you know...chlorine..germs...bugs....that.)



There are SO many new things. Where to start...??..But - let's start at 1).

1. New context menu
We've totally redesigned the context menu. The main reason for this is that it was getting a bit cumbersome / fat / lazy / had too many Doritos. If you had a lot of transforms you had to really know your way around the GUI to find them all. We took some time, looked at what users mainly use and designed this:

After some weeks of tweaking the design it ended up looking like this in the GUI:

We think it rocks and you will too. YOU WILL LIKE IT! and if you don't YOU WILL LEARN TO LIKE IT! Like cauliflower and Brussels sprouts. No actually, if we're serious, it's a vast improvement from the previous context menu.

2. Java 8 support.
Yeah - eventually. The reason this took a while is that we had to do an end-to-end test of Maltego on the new platform before we were confident that we could release it. Like a new girlfriend every Java version has its own unique quirks. Things that worked perfectly well in Java 7 need a lot more TLC in Java 8. 

3. Better OSX support
Everybody knows how much we love Macs (cough). We've decided to make the Mac install and startup a lot more robust - and easier for end users. Keep in mind that with 3 versions of Java floating around (6, 7 and 8) and Windows / Linux / Mac support it's not always so easy to make sure Maltego installs and runs perfectly in all 9 environments. Not to mention the small differences between Mavericks and Yosemite, Windows 7 vs Windows 8 and easing the install on Flubber Linux 8.15.221. 

...but the most exciting feature...

4. Transform Hub
When we started Maltego almost 8 years ago our vision was always that other people can build their own transforms. As Maltego became more mature this dream was becoming a reality and today many interesting projects are using Maltego as their front end. The problem was that sharing these transforms with the rest of the world was a bit... tedious. This is why we decided to build functionality that allows you to see which cool transforms other people have made available. We call it the Transform Hub.

Note that we don't call this the Transform Store - but I guess we could have. It's basically the same thing with the exception that we hope most of the transforms will be free. It's up to all the 3rd party transform writers to decide if / how they want to price transforms. 

It basically means that when you start Maltego you'll get a list of 3rd party transforms and you get to choose which ones you want to use. Here's what it's going to look like (note that the items in the Transform Hub haven't been finalized!):

At startup you'll see the Transform Hub.


If you want to quickly see what the transforms are all about you can just mouse-over on them.


If you want to see more details - click on 'details' (so original). Here you can see things like the transform writer's web page, if it's commercial or not, where to register (if you need to) and where you can contact the transform writers. 

Once you're ready to install the transforms simply click on 'Install'.  


And that's pretty much it. No more seed URLs to enter. With the new TDS we push entity definitions to the client during install as well! One click install. <in a very soft/low voice/and spoken very quickly: "This applies for TDS transforms only">. Yeah of course.

With this fantastic development in place we call all transform writers! Let us know what you've been brewing up and we'll add you to ..<cave reverb, lots of echo> THE TRANSFORM HUB..Hub...hub...ub...b.

One more thing - you can always add your private server to the list too. And - if the transforms you are using are hosted on a 3rd party's own TDS server your traffic will only go to this 3rd party - we don't see it! 

If you are interested in getting your own TDS - please let us know. We're keen to sell you one. And you can sell transforms. And you can sell your transforms on a TDS server to other people. Together we can make lots of money! (sorry, marketing INSISTED on this paragraph, they're not the most creative bunch. In the long run we actually see a lot of free stuff on there. We hope. Actually - it's in your hands really.)

PS - so... the public TDS is not 100% there in terms of all the bells and whistles of the shiny new commercial TDS - but Andrew promised us that he will be making tea for everyone every 3rd day until he's done porting the commercial TDS to the public TDS.

5. Development portal & forum
Because of 4) we decided it would be a great idea if people actually knew how to build their own transforms! In the past we've been...well...not so great at that - we confess. But this has all changed! 

We spent days and weeks building a really snazzy looking development website. It's at [http://dev.paterva.com/developer/] and it's packed with all sorts of nice goodies. We're still working on it so not *all* the sections are 100% completed but it should be a great resource for people wanting to write transforms. 

And also - the forum is back. Well - the development forum. Until the spammers list their shit on there again. Then we'll put our famous Maltego Community Edition CAPTCHAS on there!

The Chlorine release should be ready to go by the middle of February 2015 - we're really looking forward to seeing the feedback from our users. We'll start off (as always) with the commercial release and the community release (and the Kali Linux release) will follow soon afterwards.

Thanks for reading all of this - wow - it's a lot - we've been damn busy!

Baby seals,
RT

Monday, December 22, 2014

Keeping with tradition

Yup - it's that time of the year again. Another year is almost done. For many of us at Paterva, 2014 was a really really tough year. But this post is not about us - it's about you. No well.. not really. It's about this time of the year and our special discount we run for a few days.

It's the time when your extended family comes to visit with their 17 snotty nosed kids, the time when your weird uncle pitches up unannounced wearing only underpants and when your wife/husband/girlfriend/boyfriend/dog/cat/budgie/parrot is as stressed-out as you are. Perhaps us IT people are not *really* that much of herd animals after all.

This is why we run the Maltego Christmas special every year. This year we're giving a 44% discount on licenses - from today up to the end of the 25th. It will allow you to retreat back to your computer during these days and make pretty graphs. When someone calls you for another family photo you can show them the graph (we suggest full screen mode) and say "Can you see I am busy with important stuff?!"

The coupon code this year is 'IHateSocks'. It's not like we really hate socks...but rather buy your loved one (or yourself) a freshly computed Maltego license.

Peace / love / baby seals,
RT


Monday, October 13, 2014

Announcing Maltego Carbon Community edition

Hey there people of the Internet..

Woot! We’re excited to announce that Maltego Carbon has finally come to the masses. We've chosen a 1984 Russian Nuclear Expo theme for the splash page:



We stealthily uploaded the binaries to the website on Friday – but we wanted to wait with the announcement until today (as it’s silly to do a “press release” on a Friday – right?). This is a major new version of Maltego so you’ll need to get a new installer from our website. Simply click on the download section of the website to get yours now! [LINK]

The Carbon release is a major step up from the previous version (Tungsten). The major improvements are discussed in this video:

Killer features – integrated OAUTH allows you to have your Twitter transforms back, and there's a slick new import from CSV and XLS files. 

An online update of Carbon is included in the download – taking it to 3.5.3. It has a lot of new transform goodies and fixes – it's discussed in this video post:



Oh – last thing – don’t get too annoyed with the CAPTCHAS. We know you need a doctorate degree in hieroglyphics to solve them – and as such – we've disabled them for now. You may just enter ‘Paterva rocks’ and it will register a pass. Truth be told….you can enter anything. 

Enjoy!
RT

Tuesday, September 9, 2014

Tweet Sentiment Analysis

Hey there,

It’s been a while since we last posted but we are pretty excited about our new sentiment analysis transforms so I thought I would make a quick post about it.

Sentiment analysis can be described as the use of natural language processing (NLP) to extract the attitude/opinion of a writer towards a specific topic. With the overwhelming amount of data being posted on the Internet every day and no way to read it all, sentiment analysis has become a really useful tool for extracting and aggregating opinions on a specific topic from many different sources. The potential uses for sentiment analysis are endless; a few examples are brand reputation monitoring, market research and stock-exchange monitoring. The transform that we built takes a Tweet as its input entity and returns a positive, neutral or negative entity. This way a large number of Tweets can easily be categorized according to their sentiment.

Although sentiment extraction is a relatively new area of research there are quite a few methods of going about it and a lot of companies offering different sentiment analysis APIs. With many APIs to choose from it was quite difficult to decide which one would work best for the transform. I first decided to take my top four APIs, aggregate their results and use that as the output of my transform. The problem with this method was that most of the time the APIs would return different and often obviously incorrect results (I won’t mention any names). While some APIs seemed to work well on certain topics of Tweets they would fail horribly on others. After much experimentation I settled on using only AlchemyAPI’s sentiment analysis tool, which seems to work the best out of all the APIs that I tested - and I tested quite a few, so well done to them.
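For the curious, aggregating several APIs could look roughly like the Python sketch below. The post does not say how the results were actually combined, so the majority vote, the tie rule and the function name aggregate_sentiment are all assumptions on my side:

```python
from collections import Counter


def aggregate_sentiment(labels):
    """Majority vote over per-API labels; ties fall back to 'neutral'."""
    if not labels:
        return "neutral"
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "neutral"
    return ranked[0][0]


# Made-up label sets standing in for the answers of four different APIs.
agreeing = aggregate_sentiment(["positive", "positive", "neutral", "positive"])
split = aggregate_sentiment(["positive", "negative", "positive", "negative"])
```

The second call shows the weakness described above: when the APIs disagree evenly, the vote tells you very little, which is one reason to prefer a single API that behaves well.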

I then built a new machine named Twitter Analyser to use with the new sentiment analysis transform. This machine takes a phrase as its input and searches Twitter for Tweets containing this phrase. From the Tweets that are returned, hashtags, links, sentiment and uncommon words are extracted. The uncommon words are extracted with one of our other new transforms, which checks each word against an ordered list of common words: if the input word does not occur in the list before a certain threshold, the word is returned as an entity.  The To Words transform takes two transform settings: the threshold up to which it must search in the list of common words, and words that should be ignored by the transform. The machine will run every 5 minutes to search Twitter for new Tweets. Running the machine in bubble view, it is easy to see common hashtags, links, words and sentiment between Tweets of a certain topic. The screenshot of the graph below shows an example of using the Tweet Analyser machine on the phrase AlchemyAPI:


In this image the entities are sized according to their number of incoming links so you can see what is common between many Tweets.  From the image you can see common hashtags like: #ai, #deeplearning and #sentimentanalysis as well as pick out the common links and words between the Tweets.
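The uncommon word check described above can be sketched in Python as follows. This is only an illustration of the threshold idea, not the transform's code: the tiny COMMON_WORDS list is a stand-in for a real frequency-ordered word list, and the function and parameter names are assumptions:

```python
import re

# Stand-in for a frequency-ordered list of common words (most common first).
COMMON_WORDS = ["the", "to", "and", "a", "of", "is", "new", "with", "release", "security"]


def uncommon_words(text, threshold, ignore=()):
    """Return words that do not occur in the first `threshold` common words."""
    frequent = set(COMMON_WORDS[:threshold])
    ignored = {word.lower() for word in ignore}
    result = []
    for word in re.findall(r"[a-z']+", text.lower()):
        # Skip frequent words, ignored words and words already collected.
        if word not in frequent and word not in ignored and word not in result:
            result.append(word)
    return result


words = uncommon_words("The new Maltego release is out", threshold=8, ignore=["out"])
```

Raising the threshold makes the filter stricter: with threshold=8 the word 'release' still counts as uncommon, while with threshold=10 it is filtered out.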

As always enjoy responsibly!
Paul

PS: As most of you already know we have recently released an update to Maltego [version 3.5.2], our YouTube video here gives a quick breakdown of the new features:  https://www.youtube.com/watch?v=QK6PX4Fq5xY&list=UUThOLpqhLFFQN0nStdkyGLg

Tuesday, April 1, 2014

Maltego Carbon - now!

Hi there,

It took a while to get there (and this release was tricky to schedule) but we're finally ready with Maltego Carbon. As this is a major release you'll need to download a fresh copy from the website  - it does not update from Tungsten.

Major new features are:

  1. OAuth - means you can use Twitter transforms again!
  2. New and improved tabular import - create a graph from XLS/CSV  
  3. Discovery of Maltego configs from NTDS (much needed feature for those with their own servers) as part of discovery.
  4. Bug fixes, optimization and new looks. 

Andrew explains all of this in the next episode of AndrewTV (click below for this exciting episode):


For server clients - you can download new updated servers from the server portal:

  1. NTDS - now with paired configuration functionality and generic OAuth support. You can simply backup your configuration and load it into the new VM.
  2. CTAS - with new Twitter transforms and fancy *_SE transforms.

Maltego Carbon is live for download from the [Paterva website] right NOW. As always the free community edition of Carbon will follow in a few weeks.

Enjoy responsibly!!!
RT






Thursday, March 20, 2014

Where is my Maltego release?

Good question. We said we'll release this week and we haven't. Actually - truthfully, we said we'll release LAST week and we didn't. There are good reasons for this delay.

The updater broke - don't ask why.... Which means we can't send the new release as an update. Which in turn means we should really have a major new version and new builds - and a new element name. And that's OK, because we've added a whole lot of cool new features into this release. But - it also means that we need to do a bit of rebranding and have a few more edges to smooth out. And all of this takes time. Additionally it's a short week in SA (Human Rights day tomorrow), we finally moved into our Cape Town offices (and had to sort out Internet access) and nobody pushes out a new release on a Friday.

We CAN tell you that we're trying our best to have Maltego CARBON out before the end of the month. Yeah I know...don't even mention it. But you know that it's close when we're deciding on splash screens and testing angles for the associated video.

We'll leave you with the shortlist of Maltego Carbon splash screens: