Friday, March 13, 2015

Connecting the links


Hello there,

Today I am going to talk a bit about the new Linkedin transforms that we have been working on. Linkedin is all about finding connections between people, so what better way to visualize this information than in Maltego? I set out to build some Linkedin transforms that could help show connections between Linkedin users, their shares and company profiles that may not be easy to identify on Linkedin itself. All the transforms that I built here use the Linkedin developer API, so you can log into your own Linkedin account from Maltego and start visualizing your Linkedin network.

Linkedin's API provides awesome search functionality for finding people and companies, allowing you to refine your searches with additional search parameters. This makes it a lot easier to find profiles with common names. Our Linkedin transforms allow you to enter these additional search parameters using transform settings (transform pop-ups). To search for a Linkedin company profile from within Maltego, start with a phrase entity and run the Linkedin Company Search transform; a transform setting will pop up asking whether you want to specify a country-code. Running this transform on the phrase ‘KPMG’ without specifying a country-code results in the graph below:
The results returned from the Linkedin Company Search transform are weighted according to relevance, meaning that the entity in the top left-hand corner is the most relevant result for your search. The detail view contains links to the company's Linkedin profile page and to their website, as shown in the image above. There is a range of transforms that you can now run on the Linkedin company profile entity, listed in our shiny new context menu (also shown in the image above). One of the highlights is the To Email Domain transform, which returns the domains that the company has specified they receive email on. This transform often returns loads of results, which is great if you are looking for sub-domains for that company. Running To Email Domain on the first company profile from our 'KPMG' search returns 34 different email domains, many of them sub-domains of kpmg.com. The result is shown below:
If you are ever looking to mine email addresses for a company, this is probably a good place to start, but that is a bit off topic for this post so I will leave it for you to try on your own.
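Under the hood this kind of refined search is just extra query parameters on Linkedin's search endpoints. Here is a rough sketch of building such a request URL; note that the endpoint path and the facet parameter name below are illustrative assumptions, not the documented API - check Linkedin's developer docs for the exact syntax:

```python
from urllib.parse import urlencode

def company_search_url(keywords, country_code=None):
    """Build a Linkedin company-search request URL.

    The endpoint path and 'facet' parameter are hypothetical
    placeholders standing in for the real API syntax."""
    base = "https://api.linkedin.com/v1/company-search"
    params = {"keywords": keywords, "format": "json"}
    if country_code:
        # narrow the search to one country, like the transform's pop-up does
        params["facet"] = "location," + country_code.lower()
    return base + "?" + urlencode(params)

print(company_search_url("KPMG"))
print(company_search_url("KPMG", country_code="ZA"))
```

The point is simply that each transform setting maps to one extra parameter on the search call.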

To search for a person’s Linkedin profile from Maltego, you run the Linkedin People Search transform on a person entity. Three transform settings will pop up allowing you to specify the person's company name, the country code of their home country and a past school of theirs. These transform settings are really useful when searching common names: for example, searching the name John Doe while specifying the country code IR (Iran) returns only two Linkedin profiles. If you excluded the country code from this search you would be flooded with results. The image below shows this search result as well as the context menu listing all the transforms that can be run on a Linkedin Affiliation entity:
The Detail View in this image shows additional information about the selected user, including their Linkedin headline, location and the industry they work in.


Currently the Linkedin People Search transform returns the 25 most relevant results for your search while the Linkedin Company Search transform will return the 20 most relevant company profiles for your search.

Okay, enough with the details. Let's move on to an example of how this can be used: imagine you wanted to inform as many Linkedin users from a particular company of something, without directly messaging them and without them being aware that they are being fed targeted information. Here is how we could do it: start by finding our target company's Linkedin profile, then run the To Affiliations [only in your Network] transform on it. This transform returns all the users in your network who work (or worked) at that specific company, resulting in the following graph:

From all these users we then want to see which shares are currently showing in their news feeds. To do this we run the To Shares in User’s Network transform, which results in the following graph (shown in bubble view on the left):

This graph is quite large, but by selecting all the shares and ordering them by their number of incoming links we find a single share that is currently on 23 news feeds belonging to users at our target company. Taking this share plus its incoming links to a new graph results in the following:

Now if we were to post a comment on this share, we know that it would show up on the news feeds of 23 Linkedin users who work (or worked) at our target company.
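The ranking step, ordering shares by their number of incoming links, is easy to reproduce outside Maltego too. A minimal sketch with made-up user names and share IDs:

```python
from collections import Counter

# (user, share) pairs: "this share appears on this user's news feed"
feed_links = [
    ("alice", "share-42"), ("bob", "share-42"), ("carol", "share-42"),
    ("alice", "share-7"), ("dave", "share-9"),
]

# count incoming links per share and rank them, most-linked first
incoming = Counter(share for _user, share in feed_links)
ranked = incoming.most_common()
print(ranked[0])  # the share on the most news feeds
```

In Maltego the same thing is two clicks (select all shares, sort by incoming links), but this is all that's happening.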

Next we want to find who authored this share. To do so we run the To Share’s Author transform on it, which reveals who initially posted it. Finally, we run the To Companies transform on this user to reveal the company they work for:

This user’s Linkedin profile seems to be quite popular amongst users from our target company, so its owner may be a person of interest if we were really targeting this organization. The next step would be to find the profile owner's email address, which could be done by finding the company's email domain and then their email address naming format, but again that is out of the scope of this blog post.

I have one last highlight from our new Linkedin transforms that I want to mention before it's time to go. The To Entities [Using AlchemyAPI] transform can be run on a Linkedin share entity; it extracts people’s names, places and company names that are mentioned in the share article. It is a nice way to easily identify topics being discussed across multiple shares in your Linkedin network.

A quick word about rate limits on the Linkedin API. To use these transforms you will need to log into your Linkedin account from Managed Services in Maltego. Most of the API calls that these transforms use are limited to around 300 calls per day per user. When you reach your limit for the day you will receive a message in your transform output notifying you, and you will have to wait until midnight UTC for the limit on your account to be reset. The Linkedin People Search and To Affiliations [in your network] transforms have a much stricter limit, so you might find that you reach the limits for these transforms a lot quicker.
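If you want to handle that reset programmatically (say, to tell a script how long to sleep before retrying), the wait until midnight UTC is simple to compute. A sketch:

```python
from datetime import datetime, timedelta, timezone

def seconds_until_reset(now=None):
    """Seconds until midnight UTC, when the daily API quota resets."""
    now = now or datetime.now(timezone.utc)
    midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    return int((midnight - now).total_seconds())
```

Called at 23:59 UTC this returns 60; called just after midnight it returns close to a full day.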

For those of you who have upgraded to Maltego Chlorine, the Linkedin transforms will be arriving in your Transform Hub shortly; you will be able to add them to your Maltego client by simply hitting the install button. For those of you still running Carbon, here is the seed URL:



Enjoy responsibly

PR.

Tuesday, March 3, 2015

Maltego Chlorine is ready for download

All,

TL;DR: 
The new release is called Chlorine - it was a tough one. It's an awesome release: we fixed many bugs and built many features.

You should download it. Now. [ Here ].  Or click on the pretty picture.
Release video is [ here ].



The full story

Here at Paterva we've had a few milestone Maltego releases. Maltego 3.1 was one, Maltego Tungsten was another. It's hard to say which one was the most difficult to get 'over the line'. Maltego Chlorine was one of those 'giving birth' releases.

We've worked really hard at it. The release was supposed to be in mid February - then we delayed it because we kept finding conditions we'd previously missed. A lot of testing was done on Chlorine, and a lot of bugs (some even dating back to version 3) were fixed. We even [started talking] about it early in Feb.

A product like Maltego is never really completely finished. At any given stage there is a list of features we still want and a (smaller) list of things that really annoy us. We could easily develop Maltego for months before pushing out a new release, but at some stage you need to let go and put it 'out there'. We're there now - it's 10 months since our last major release and the baby is overdue.

We made a video describing what's new in Chlorine. The plan was to take the cable car up Table Mountain and shoot the video at sunset overlooking Cape Town. It started raining during the first take. There was a pesky helicopter buzzing around (because it was the State of the Nation address in parliament that day). It was shot on the 12th of Feb - almost 3 weeks ago - so the look and feel has changed a little bit here and there, but you'll get the basic idea. Click below to watch what Chlorine is all about:


New features
As the video says - the major features of Chlorine are as follows:
1) Transform Hub
2) New context menu (right click menu)
3) Java 8 support - and lots of OSX install/first run enhancements

What the video does not say is that we now have:
4) Sizable fonts (no more needing a microscope to read detail view)
5) Output window shows links to entity for easy tracking
6) Removed our branding from the PDF report (SO many people, SO angry)
7) LOTS of bug fixes
8) New branding, higher quality icons / logos etc.
9) Not really a feature of the release, but we have a brand new [developer website].

A short history of Maltego releases:
We've also realized that people have difficulty following release dates, names and features. So here goes:
Jun 2010 - 3.0 - NoName - First major release, redid graphing engine, new protocol.
Feb 2012 - 3.1 - NoName - Basically redid version 3... graph annotations, link styles

We then decided to use element names for the releases:
Sep 2012 - 3.3 - Radium - Scriptable transforms (machines), auto update
Aug 2013 - 3.4 - Tungsten - Real-time graph collaboration
Apr 2014 - 3.5 - Carbon - OAuth capabilities - return of Twitter transforms
Mar 2015 - 3.6 - Chlorine - Transform Hub / context menu

In between the major releases there have been a lot of on-the-fly updates, patches, hot fixes etc.

It's been a long and interesting journey. We hope you enjoy using our software as much as we enjoy building it.

So long / baby seals / going to sleep for a week,
RT

Wednesday, February 11, 2015

Building your own LovelyHorse monitoring system with Maltego (even the free version) - it's easy!

Someone linked me to the [LovelyHorse] thingy. If you missed it - it's basically a leaked GCHQ document containing a list of a few security-related Twitter accounts that the agency was supposedly monitoring. Seeing that, since the last release, we have some interesting Twitter functionality in Maltego, I figured it would be interesting to see how we can replicate their work.

First - manually

Before even starting with Maltego I first spent some time thinking about what I really wanted from this and did it all by hand (still in Maltego, but before we start to automate the process). As a start I'd need to get the people's Twitter handles. Well that's easy - the document lists them all. In Maltego I can start with an alias and run the transform 'AliasToTwitterUser' to get the actual Twitter handle:


I want to get the Tweets that the people wrote. There's a transform for that too - 'To Tweets [that this person wrote]'.


OK great - now I have the last 12 Tweets (my slider was set to 12). What information can I extract from the Tweet itself, keeping in mind that I want to end up doing this across 36 different handles? Well - possibly any hashtags, any URLs mentioned in the Tweet and any other Twitter users' handles. There are transforms for all of those.
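Outside Maltego, those three extractions boil down to simple pattern matching. A rough sketch (the real transforms also resolve the shortened t.co links, which this does not):

```python
import re

# rough patterns for the three entity types the transforms pull out
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")

def extract(tweet):
    """Pull hashtags, mentioned handles and URLs out of a Tweet's text."""
    return {
        "hashtags": HASHTAG.findall(tweet),
        "mentions": MENTION.findall(tweet),
        "urls": URL.findall(tweet),
    }

tweet = "New #malware write-up by @thegrugq: http://t.co/abc123 #infosec"
print(extract(tweet))
```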


Running those on a single Tweet we get something like this:


Note how the http://t.co links are nicely resolved. This Tweet didn't contain any other aliases - so you only see the hashtags and the URLs.

If we select all the Tweets and run the 3 transforms across them, we see that there are some matches - in this case on the hashtags 'infosec' and 'malware':


Now, this is not really interesting at all but it's a starting point. When I do the same for the last 12 Tweets of all of the lovely horses (as I'll call this group of Twitter handles) I might see some pattern.

Now - all the horses

I copy the text from the PDF and paste it into a text editor - Notepad will do. Clean it up a bit and we have:

Select all and paste into Maltego. Every line will be mapped as a phrase. Select all the entities (Ctrl+A), change the type to 'Alias' and run the 'AliasToTwitterUser' transform on all of them - like we did at the beginning, except now we're doing it on all the aliases. It should look something like this:

At this stage I can get rid of the Aliases because I am not going to use them anymore. I click on 'Select by Type' on the ribbon (Investigate) and select 'Alias'. Delete - and they're gone. I do a re-layout, select all the Twitter handles and run 'To Tweets [that this person wrote]' - this time on all of them. Essentially I am repeating the entire process we did - but this time on all the lovely horses.

When the transforms complete the graph now looks like this:


All that's left is to run the 3 transforms (pull URL/hashtag/alias) on all the Tweets. To select all the Tweets quickly I use 'Select by type' again - this time on Twit. This takes a while to complete... but when Maltego has pulled out all the hashtags, URLs and aliases from the last 12 Tweets of all the lovely horses, it looks like this:

No doubt this looks like ass. That's because the block layout is not really suited to this type of graph. But click on Bubble View:


and you get:


Let's get real

I won't lie - I've been spoon-feeding you up to now. Let's stop - else this blog post is going to morph into a book. I am going to assume that you have a bit of Maltego experience under the belt by now.

The way we've been doing things up to now is really not terribly interesting or accurate. We're getting the last 12 Tweets, but what we really want is all the Tweets in the last X seconds. Imagine one horse hasn't been on Twitter in 14 days - then matching his/her Tweets to what's happening right now does not make a lot of sense (in a monitoring scenario). We need to introduce the idea of a sliding time window. The Twitter transforms did not support that.

Didn't. Does now. Well - 'To Tweets [that this person wrote]' does now. I hacked it quickly. Anton will not approve... but it works as it says on the tin. I've added a transform setting called 'Window' - 0 by default, but when changed it implements this 'in the last X seconds' behaviour.
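Conceptually the window setting is nothing more than a time filter on the Tweets before they are returned; something like this (timestamps in epoch seconds):

```python
import time

def within_window(tweets, window, now=None):
    """Keep only Tweets posted in the last `window` seconds.

    `tweets` is a list of (timestamp, text) pairs; window=0 disables
    the filter, matching the transform setting's default."""
    if window == 0:
        return tweets
    if now is None:
        now = time.time()
    return [(ts, text) for ts, text in tweets if now - ts <= window]
```

This is a sketch of the idea, not the actual transform code.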

Now it becomes interesting when combined with machines (scripted transforms) - especially perpetual machines.  Consider the following machine:

machine("axeaxe.LovelyHorse", 
    displayName:"LovelyHorse", 
    author:"RT",
    description: "Simulates the GCHQ's LH program") {

    onTimer(240) {
        type("maltego.affiliation.Twitter",scope:"global")
        run("paterva.v2.twitter.tweets.from",slider:15,"window":"1800")
        paths{
            run("paterva.v2.pullAliases")
            run("paterva.v2.pullHashTags")
            run("paterva.v2.pullURLs")
            run("paterva.v2.TweetToWords")
        }

        //half hour + half hour = one hour:
        //the entities will be deleted if older than half an hour,
        //but the transform's time frame adds another half hour
        age(moreThan:1800, scope:"global")
        type("maltego.Twit")
        delete()
        
        age(moreThan:1800, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

Let's take a look. We run our sequence every 4 minutes (4 x 60s = 240s). We grab all the Twitter handles and get their Tweets - at most 15 per handle, and only if they were written in the last half an hour (30 x 60 = 1800s, so we set the window parameter to 1800). After this it's plain sailing - we get the aliases, hashtags and URLs.

At some stage we need to get rid of old Tweets - else our graph will just grow and grow and grow. So there's a little logic to delete nodes when they're older than half an hour. This means that at any stage we have a one-hour view on the activity of the horses. One hour - because at the limit the initial transform can return a Tweet that's 30 minutes old, and it will then stay on the graph for another 30 minutes.
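The deletion logic is just an age cut-off. A sketch of that half-hour rule in Python (ages in seconds):

```python
def prune(entities, max_age, now):
    """Drop entities older than max_age seconds, like the machine's
    age(moreThan:...) + delete() filters.

    With a 1800 s fetch window and max_age=1800 here, a Tweet can
    arrive up to 30 minutes old and then live another 30 minutes on
    the graph: a one-hour view in total."""
    return [e for e in entities if now - e["created"] <= max_age]
```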

The resulting graph will show us when they Tweet the same keyword (courtesy of the 'TweetToWords' transform) or hashtag, mention the same website, or mention the same Twitter handle in their Tweets. And if they are not active on Twitter, the graph won't contain outdated info.

Of course - the values can be changed depending on how closely you want to monitor the situation. If the resolution is a day then the window and age values should be (24 x 60 x 60) / 2, and you should 1) up the number of Tweets returned in the slider value (as TheGrugq Tweets waaay more than 15 times in a day) and 2) not poll every 4 minutes.
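For a one-day view the same half-and-half split applies:

```python
# one-day resolution: fetch window + maximum graph age = 24 hours,
# split evenly, exactly as with the half-hour version above
DAY = 24 * 60 * 60
window = DAY // 2   # only fetch Tweets from the last 12 hours
max_age = DAY // 2  # delete graph entities older than 12 hours
print(window)       # value to plug into both 'window' and age(moreThan:...)
```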

Advanced
Right - so what we REALLY want is something that can tell us when more than X horses' Tweets are linking to the same thing (be that a website/hashtag/whatever). For that we can't just use the 'incoming()' filter, because one person could send ten Tweets mentioning the same website and then the website would have ten incoming links. No - it has to have unique starting nodes (the horses).

We have that filter. It's called 'rootAncestorCount()'. So now - with a combo of bookmarks and this filter hackery we build something like:

        //if an entity links to moreThan 2 horses & 
        //we haven't seen it before  - mail
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmarked(1,invert:true)
        run("paterva.v2.sendEmailFromEntity",EmailAddress:"roelof@paterva.com",EmailMessage:"Multiple horses mentioned: !value!",EmailSubject:"Horse Alert")
       
        //this is to ensure we don't email over & over
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmark(1)


Basically what happens here is that we check for all entities with more than one incoming link (these can only be hashtags/URLs/words/aliases) and find the ones that have more than 2 unique root ancestors (i.e. horses). If we find them, and we haven't seen them before (this Boolean flag is implemented with a bookmark), we mail the value out. We do the mailing with a transform that we wrote (and for obvious reasons cannot make public, else it will be used for spam). It's not rocket science tho.
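To see why unique root ancestors differ from plain incoming links, here's a toy version of the filter on a parent-pointer graph (node names are made up; this is an illustration, not Maltego's implementation):

```python
def root_ancestors(node, parents):
    """Walk upward from `node` and collect the root nodes (those with
    no parents) that can reach it. `parents` maps node -> parent list."""
    roots, stack, seen = set(), [node], set()
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        ps = parents.get(n, [])
        if not ps and n != node:
            roots.add(n)
        stack.extend(ps)
    return roots

# three horses link to one hashtag via one Tweet each...
fan_in = {"#infosec": ["t1", "t2", "t3"],
          "t1": ["horseA"], "t2": ["horseB"], "t3": ["horseC"]}
# ...versus one horse sending two Tweets about the same site
spam = {"site": ["t1", "t2"], "t1": ["horseA"], "t2": ["horseA"]}
print(len(root_ancestors("#infosec", fan_in)))  # 3 unique horses
print(len(root_ancestors("site", spam)))        # still just 1
```

Both targets have multiple incoming links, but only the first has multiple unique root ancestors - which is exactly the distinction rootAncestorCount() makes.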

Such a machine can run for days. The resultant graph for today (it's almost midnight), when configured with a one-day window, looks like this:


Highlighted with its entire path here is the hashtag 'security' - no surprise here. The other one was the alias DaveAitel (again, not surprising). Below is the email received. Remember that we'll only receive an email ONCE per alert, only when more than 2 horses link to the entity, and only if it happened within a day.



The complete machine looks like this (please change values as needed - speed / resolution /etc):

machine("axeaxe.LovelyHorse", 
    displayName:"LovelyHorse", 
    author:"RT",
    description: "Simulates the GCHQ's LH program") {

    onTimer(600) {
        //find Twitter handles on graph
        type("maltego.affiliation.Twitter",scope:"global")
        
        //run to Tweets transform
        run("paterva.v2.twitter.tweets.from",slider:30,"window":"43200")
        
        //extract Alias/Hashtags/URLs and uncommon words
        paths{
            run("paterva.v2.pullAliases")
            run("paterva.v2.pullHashTags")
            run("paterva.v2.pullURLs")
            run("paterva.v2.TweetToWords")
        }

        //if an entity links to more than 2 unique horses & 
        //we haven't seen it before  - mail it out
        //comment this entire section if you don't have a mailing transform

        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmarked(1,invert:true)
        
        run("paterva.v2.sendEmailFromEntity",EmailAddress:"roelof@paterva.com",EmailMessage:"More than 2 horses mentioned: !value!",EmailSubject:"Horse Alert")
        
        //this is to ensure we don't email over & over
        incoming(moreThan:2, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmark(1)
        
        
        //delete nodes when they grow old
        //12 hours + 12 hours = one day:
        //the entities will be deleted if older than 12 hours,
        //but the transform's time frame adds another 12 hours
        age(moreThan:43200, scope:"global")
        type("maltego.Twit")
        delete()
        
        age(moreThan:43200, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

I hope you've enjoyed this (waaaaay too long) blog post on how our thinking goes. Of course - you get a lot more understanding of these things if you do it yourself. All of the above functionality exists in the (free) community edition of Maltego too - although there you probably want to monitor at shorter intervals (say 15 minutes), as you can only display 12 Tweets per person. All in all - that's probably better... ;)

'laters,
RT

PS: For more information on machines check out our newly built dev portal at [http://dev.paterva.com/developer]. The machine syntax etc. is located under 'Advanced'.

And also - we made a video some time ago that shows the same kind of principle - it's [ here ].