Chris? Alright, I'm going to go ahead and get started. I have a few disclaimers. First one being, if PowerPoint or Keynote were a martial art, I am a white belt. I don't like fancy slides; it's black on white, very easy, no fancy graphics, no videos, it's all low tech. I might as well be up here with a bunch of flip charts.

So, my name is Christopher Witter, and I'm going to talk about enterprise packet capture. A little bit about myself: I happen to straddle the wire in my day job. I do disk forensics and network forensics, and I happen to like them both almost equally. In my environment I manage two hubs, so I tend to favor the network-based forensics because I have a much wider net. If I find one thing, I can now look for it across a 60,000-user base and maybe find 8, 10, 15, 20 other people doing the same nefarious activities, getting the same emails that came through. So I'm going to tell you how to build your own packet capture engine on the cheap.

So, first disclaimer: I'm not a lawyer, I don't claim to be one, I don't play one on the internet. One of the big things about packet capture is personal information. By having the packet, you have everything. And you're going to hear me reiterate: the proof is in the packet. If we don't have the key to break encrypted traffic, we can always go back to disk and see what was done; but if we have the packet, eventually we can probably get to the end state and find out what was done. The big thing to keep in mind is you may have legal obligations, maybe for e-discovery. Your lawyer is going to say, hey, wait a minute, you have all this information, you're capturing every single email that enters or leaves this network. That's a problem, because if your social security number goes out, or maybe proprietary documentation, there could be issues with having access to that or controlling access to it. Who has access to these servers? One person, five people? Is it authenticated, do we have logs, who's doing what with that information?

So why capture packets? As I said, evidence. I like it for the ability to go back and see what happened. You can also use it to verify a user's response in an interview. "I did not click on that link, I didn't even open the email." Well, we have evidence that says otherwise; would you like to recant your statement? Did you click on the link? It's not going to get you in trouble, I mean, I'm okay with that. We just need to know what happened. What made you click on the link? Why did you think this was a legitimate email? Go through those motions, basically: hey, here it is, plain as day. Or if it's an insider thing, you might actually need it as actual evidence to prosecute, which is a whole other issue.

But we can get attachments, web pages, emails, network traffic analysis. I actually started in network engineering, so having the packets, I can tell: 80% of the traffic on this OC12 is HTTP, okay? 20% is email. That's all of our traffic. Or we're having a slow web server issue. We can go back and verify from the packets if there are too many SYNs, resets, and look at the exact statistics of what's happening during that communication. It's good to have. Test and develop IDS rules: a new 0-day comes out, you happen to be one of the few individuals in the entire world that gets targeted, and you're lucky enough to actually discover it.
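A minimal sketch of what that rule testing can look like against saved captures, assuming the archives live under /data/pcaps in per-day directories and the one new rule sits in a minimal Snort config; all of the paths and file names here are hypothetical:

    # Run one new Snort rule across an archived pcap corpus.
    # /data/pcaps/<day>/*.pcap and oday.conf are made-up names;
    # -r reads a capture file, -c points at the config holding the
    # single rule, -q suppresses the banner, -A console prints alerts.
    for pcap in /data/pcaps/*/*.pcap; do
        snort -q -A console -c /etc/snort/oday.conf -r "$pcap"
    done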
Now you can go back and take the attack that happened and replay those packets through your rules to fine-tune them, to better identify that traffic. And if you have the packets, you can replay your entire corpus of packets using that new rule. So if you have 60 days' worth of data, you can say, all right, I just want to apply this rule in Snort, go through all these packets with just this one rule for evaluation, and see if it happened before you discovered it.

Okay, so there are a lot of commercial products out there. I have experience with a few of them. I have my favorites, I have my not-so-favorites, but it's actually very, very easy to do on your own, cheaply. If you can sit at a Unix shell, you can probably master it yourselves. A couple of the products out there: WildPackets, NIKSUN, NetWitness, Solera Networks, and Endace. The problem with the packet capture world is everybody does one thing, maybe two things, very, very well. There is no one complete solution that does everything well. It's like, oh, well, they can capture the packets very well, but I can't search them; or I can search them really well, but I can't back up the data. They all have their issues, and it's just easier, in my opinion, if you roll your own, because it's so easy to do and very cost-effective. You can spend your money on hardware versus licensing somebody's software, where they're just selling you really expensive hardware and then charging you a really high licensing fee for the software that's running on there, for their proprietary pieces, which sometimes just get in the way.

So we have the problem, which is a square hole. Every solution out there in the industry is round, so there's always a little piece that's missing. Okay, my requirements are 90 days of data retention. I need to be able to search those packets while I'm capturing packets. I want to be able to back that data up. A commercial solution might be able to do two of those; one of them might only be able to do one. Rolling my own, I might be able to do all three. But then you come up with use cases later on: oh, man, if I had this data, what I could do with it. By locking yourself into a proprietary solution, now you have to go back and possibly convert that data. And if you've ever tried to convert a terabyte of packet data, which is one day for me in some instances, it takes 8 to 12 hours to process. So if I want to run a rule, you're like, oh yeah, we've got this new 0-day, one of my teammates wrote a Snort rule, we're going to run it against the data and see what happened. Well, you know, we're not going to know until tomorrow. That's not really an acceptable answer in an incident response case. Sometimes it's all you have, so it's what you go with.

Okay, so no proprietary formatting of data is the whole basis of this. By keeping it in PCAP, you can use it with all kinds of tools out there. You can write your own tools; there are Python modules, Perl modules, C, Wireshark. I'll talk about a few of them. There's a lot out there. It's way cheaper, which is the whole premise behind it. I'm cheap. Anybody who knows me knows that. We just ran into an instance of me being cheap. Totally off topic, but I bought a Monoprice adapter for my brand new laptop. I tested it at home. It didn't work. I asked Marcus, hey, man, I have no idea what's going on here.
He brings over his Apple adapter, which was $20, plugs it in, and it just worked. So me being cheap actually cost me some time here. All right, so you control your own destiny. You can plan for whatever you need.

So I say on the cheap. How much am I going to save? Hardware is the most expensive piece of this, because of disks, RAID arrays. If you go the hardware route for a capture card, which we'll talk about briefly, it gets even more expensive. But depending on your use case and the data you're going to be pushing through these boxes, you can save, we'll say, $10,000 to $30,000. And it's all use-case dependent.

So, system design. Server hardware: you've got to have a current motherboard with 8 gigs of RAM. It doesn't have to be anything astronomical, depending on what you're going to try to do with it. The more RAM you have, you can create a RAM disk to process data as you're capturing it, before you move it off disk, if you want to do some post-processing. Backup options: do I need to save this data, or is it just, well, I have seven days, and I hope whatever I need I can find in those seven days? Otherwise, I'm just totally hosed. Capture subsystem: we can use a hardware-based card, which is extremely expensive, and there aren't very many on the market; or we can do software-based, depending on our data rates and how much we can tolerate dropping a few packets here and there, or whether we need absolutely everything. Storage subsystem: probably where you'll spend the most money. You can use a SAN, a locally attached disk array, or seven or eight drives in one single RAID 5, which is going to be slower. You have to do some analysis on that and actually work with it.

Server software: do we have any BSD fans? FreeBSD, anybody? Okay. I happen to love FreeBSD, but in this use case, unless you go with a hardware-based card... Yes, sir? Yeah, yeah: is there a certain RAID type that is really high performance for the simultaneous reads and writes? Yes, RAID 10 would be my preferred option, which means basically whatever you do, you have to double. So if I want, say, seven terabytes of data, I need 14 terabytes of disk to do it, because I'm going to mirror whatever I am writing to disk. That way you get the best advantages of your read speeds and your write speeds. But for our design, Linux is the way to go, unless you go with an enterprise capture card. Then you're limited to the drivers, and there are some FreeBSD drivers for people who are fans.

Okay, so the capture subsystem options. Hardware: there are basically two. CACE Technologies, which has actually just recently been acquired by another company, makes a card. And Endace is like the Microsoft of the capture card world; they have probably 70 to 80 percent of the market. A lot of the proprietary products I mentioned early on use one of two things. It's either an Intel NIC, and they say, oh, well, it can only capture at such and such speed because they wrote their own driver, which we'll talk about when we get to the software; or they're using an Endace card, and they won't tell you about it. They basically upcharge you for the card and just include it in their appliance, which they then support. So Endace would be the one I would prefer; it's the only one I have experience using. When we talk about software, there are two basic options. Intel server NICs are the best NICs on the planet and have been for, I don't know, probably going on 10, 12 years.
They have the most horsepower built into them and the best Linux driver out of the box. So, software options: libpcap is one of them. Yes, sir? I'm sorry, I didn't mean to upset you, but these are two separate, exclusive options. Correct. So does the Endace card come with packaged software? It comes with a driver. Basically, it doesn't use tcpdump; it has its own hooks. I'm trying to speed up here because I'm shortening my talk a little bit, but the hardware card is basically an FPGA with 64 or 128 megs of flash RAM on it, and it spools all the packets coming in to the card's memory so it can get them to disk in time. That's how it doesn't drop anything: it's basically buffering everything. When we talk about PF_RING: ntop, run by a gentleman named Luca Deri out of Italy, has a special driver which can do similar things with the Intel cards, with a little bit of packet loss once you reach really high data rates, we're talking 800, 900 megabits a second. But for 300 euros you can get his driver and a $200 Intel NIC, and save yourself versus the $5,000 Endace card and their proprietary driver. The advantage of the hardware is it will do line rate and not drop anything, as long as you can get it to disk in time.

So, capture options. Hardware is the best thing if you absolutely have a need to get one gig full duplex and can't accept 1% loss, any loss at all. That's probably your best way to go. There are fewer interrupts on the system because it's all spooled on the card. When I say fewer interrupts: when a packet is coming in, the operating system has to address that incoming packet, and it interrupts whatever processes are taking place, depending on how it's communicating with the card, to transfer that packet out of the network card into memory to write it to disk. Well, if it's all spooled on a hardware card, sitting in that 64 meg or whatever, it's just doing big memory copies off the card, using their driver, to the disk. So it's doing larger transfers: instead of transferring every 1,500-byte frame, it might do one 64K chunk to write to disk. Software options: cheap, very effective for the cost, and it makes it a lot easier to build out more boxes, maybe cheaper, to do the same job and give you more data retention or more options to play with. The cons of hardware are that the cards are ridiculously expensive, $5K for a card; you could probably build an entire server that does 100 meg a second and doesn't drop any packets for $5K, depending on how particular you are about the hardware you choose. Software: slower capture speeds, some packet loss, higher interrupts. If you choose to use the Luca Deri driver, that higher-interrupt problem goes away, because the Intel cards actually have a built-in buffer on them and his driver enables direct memory transfers off the card to disk, which interrupts the processor a lot less and makes your system more efficient.

So, here's where you're going to spend the most money, and it's the biggest deal: the storage design considerations. One, we're writing it to disk. How much traffic are we doing? Are we writing it to disk only, or after it's written to disk do we actually want to use it? Do we want to do something with it? Do I want to run filters on it, do I want to run a Snort rule through it, am I going to do post-processing? If I'm just writing it to disk, well, that's easy: I just need to take all the hard drives I have, look at their sustained transfer rates, add it up, do a quick test, and assume that I'm going to be able to get that.
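One rough way to run that quick test, assuming the capture array is mounted at /data (a hypothetical mount point); the point is to measure sustained rates, not burst:

    # Crude sustained-write check on the capture array. oflag=direct
    # bypasses the page cache so the rate reflects the disks, not RAM.
    dd if=/dev/zero of=/data/ddtest bs=1M count=20000 oflag=direct
    rm /data/ddtest

    # Fuller read/write picture with bonnie++; -s is the test size in
    # MiB and should be roughly twice system RAM so caching doesn't
    # skew the numbers, -u is the unprivileged user to run as.
    bonnie++ -d /data -s 16384 -u nobody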
If I need to read that data back, say I have a simultaneous capture running 24/7 and I need to go through and maybe filter out some packets. Say I'm going to look at DNS for the last two days: I write a tcpdump command, use some BPF filters, grab those packets, and send them off for analysis somewhere else. I have to be able to simultaneously read that entire corpus of data to get that one piece of information out of it. So you've basically now doubled your I/O. Then you want to back up that data; you have to take that into consideration. Am I going to be writing to the disk and backing up the disk at the same time? Because depending on how much it is, and I guess I do about a terabyte a day, that requires about eight hours to write to tape. So that's my eight-hour window, and that's actually when it's slowest in my environment; if I'm also running a query, now I'm going to impede the system's ability to function.

Budget: SANs are really expensive; direct-attached disks are probably your best option, especially for data holds. If your litigation team says, you're keeping these packets, you have to secure them, you have to do X, Y, and Z, and oh, by the way, we may come to you one day and say, well, we have this email that was sent on Friday, the Exchange team doesn't have a copy of it because, well, they just don't do that, it was deleted, we need you to get it off disk, we want it out of the packets, we need to see it. Oh, we have that data, that was just Tuesday, and I have until Monday. So, all right, we'll take all these disks, put them on the shelf, slap a label on them, they're now evidence, nobody can touch them. Direct-attached disks are cheaper, they're effective, and if that happens, you're not losing a $250,000 SAN over some data holds. Purpose: traffic analysis, evidence; and as we talked about, it could be confiscated.

Okay, so legal considerations, which I touched on. Data retention policies: does your company have them, how are you impacted by them, who's controlling the data? If they ask for it, maybe they'll accept a tape backup of the data, so you don't have to give them everything; you can just write off that particular day. Data holds, I mentioned. E-discovery worries you have to address with your litigation team. Throughput: what are we doing, basically the reading, the writing, are we backing it up? Take all of that into consideration. We'll go back and talk about how to test that throughput a few slides later.

So, network hardware. I don't know how many people here are familiar with networking, but there are a few products you can use. You have dumb taps: it basically takes what comes in and mirrors it to a port on the way out. Aggregating taps: you put your in and your out, and maybe the in and out from another device, and it'll aggregate those four ports into, say, one port. So if you have maybe a 100-meg network and you want to take a couple of 100-meg tap points and send them all to the same server, as long as you don't exceed the throughput capabilities of your box, you can aggregate those all into one gig capture device. Passive taps happen to be my favorite. Because taps usually sit in line, that means: traffic in, goes through the tap, traffic out. If the tap in the middle dies, traffic stops flowing. In theory it's not supposed to; in practice it does.
If you don't like getting called at 3 o'clock in the morning because the hub went down, and now your alerts are going off, and the guy on call is like, hey, I can't figure out why the firewall can't talk to the production segment on X. Well, what's in between? Well, the first thing is your tap. Fiber, it's light. It's basically mirrors inside; you don't have to worry about it dying in the middle of the night. Your biggest worry, depending on how your cable management is in your data closets, is someone stepping on a cable and breaking the glass. So if you have a choice, fiber is the way to go. Okay, active taps: these are the ones that are supposed to fail in a closed state, so that if they do fail, they basically make the two connections electrically, and you don't have to worry about losing traffic and your sites going down. SPAN ports: a lot of switches that are manageable and modern will support one to four SPAN ports. That basically means you can span a VLAN or another port where you have your data coming in. You say span it, which means copy it out to this port, and then you plug that port into your packet capture engine, and that's where you save it. It's pretty easy stuff.

Okay, so testing your solution. We've done all the analysis, we've put it together, we have a rough idea of what it's going to cost, maybe we started buying hardware for a pilot. How do we test it? Well, we can take some production packets and use tcpreplay to replay those packets through our engine, basically to see if we're going to be able to capture them. If we want to test our I/O load, and that's where we're going to run into probably the biggest problem, getting the data onto the disk or off the disk, there are built-in Unix tools, or packages you can download if you don't have them built in: Bonnie, or Bonnie++. You can verify the I/O. Because if anybody's ever taken a hard drive, and I've experienced this doing disk forensics, it says it'll do, oh, well, it's a SATA drive, that's 300 megabytes a second. Well, okay, that's burst. What is its sustained rate? Because sustained is what you're going to be writing continuously over a period of time. It's not burst writes; it's continuously writing at this speed. You'll find that it may say 300, but if you try to write to the entire drive, by the time you get to the end of the drive it may be only 90. And that might not meet your requirements. It's something you need to test, and we can do that.

SmartBits or an Ixia: expensive, but if you're looking at doing a multi-branch-office deployment where you're going to have 10, 20, 60 of these, you could rent one of these pieces of test equipment. They're used in the Gartner reports or Miercom reports for a lot of the equipment manufacturers, to verify that, yes, this switch can do 10 gig, no loss, at X nanoseconds. You can do the same thing with your packet capture infrastructure. You can rent one of these boxes and simulate HTTP, HTTPS, a 64-byte frame, a 1500-byte frame, a 9000-byte frame, to verify that it's capable of capturing to your expectations. But in the end, nothing beats the real world. So until you put it into production, you can only make safe assumptions about what's actually going to happen.

Okay, so I have the packets. Now what can I do with them? I mean, you can capture packets with a laptop, but in an enterprise it's a lot easier to just have it sitting there, constantly doing it.
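As a sketch of the "sitting there constantly doing it" part, here's what a bare-bones rotating capture can look like with stock tcpdump, assuming a dedicated capture interface eth1 and a /data/pcaps directory, both hypothetical:

    # Continuous full-packet capture with hourly file rotation.
    # -s 0 grabs whole packets, -G 3600 rotates every hour, and the
    # strftime pattern in -w date-stamps each file; -Z drops root
    # privileges once the interface is open.
    tcpdump -i eth1 -s 0 -G 3600 -Z tcpdump \
        -w '/data/pcaps/cap-%Y%m%d-%H%M%S.pcap'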
So we can do traffic analysis: see where the data is going, source-destination pairs, how many people are doing HTTP, how much of our traffic is HTTPS. If the patterns change, with HTTPS becoming more of the traffic than HTTP, you're not going to be able to look into the HTTPS packets. You may want to make a business case and say, hey, we might want to look into a man-in-the-middle solution so we can see what's going on in these packets, so we know what our users are doing, or what our adversaries are doing to us, over SSL. We can use it for IDS rule testing, conversation reconstruction. We can use it to regenerate NetFlow data to see who talked to what, how many bytes were transferred, over what period of time. Artifact extraction: emails, URLs, attachments, files, DNS traffic, because logging DNS traffic is sometimes ridiculously hard because of the extreme volume. If you have the packets, you have the traffic.

Okay, so PCAP tools. There are a number of free tools, which I mentioned. We have tcpxtract, which basically enables you to say, here's a PCAP, I want to look for this file header, so you can pull out all the MP3s, all your AVIs. ngrep, similar: you can look for ASCII-based information within a packet stream. Wireshark, which used to be Ethereal way back in the day. Snort, tcpdump, iftop, which will give you top talkers. And NetWitness actually offers a free tool, which I'm going to talk about here.

All right, so Wireshark. I said evidence, and I use packets every day in my day-to-day job, and it's extremely effective. I happen to use the solution we're going to talk about after this one, but Wireshark is also an open-source alternative. It just requires a little more massaging. So here's an email that was sent, reconstructed using Wireshark. As you can see, it doesn't really lend itself to a quick visual analysis for maybe a junior-level analyst, and that's where the next tool is going to come into play. A lot of times you may get a phishing email, and you want somebody to be able to look at that. Sometimes, just by looking at the email, you go, oh, well, they misspelled this word, that one font in the middle is different, that URL doesn't look right. You can quickly look at something and go, all right, this is malicious. To what extent? We don't know yet, but it's definitely bad.

Okay, so here's a graphical HTTP request, which was reconstructed, and as you can see, it doesn't really look like anything. So if a user were to visit a webpage with, say, a malicious banner, and you pull up that segment or that session, and you're trying to look at the traffic to see what they saw, maybe they say, well, I didn't do anything. You know, it was an iframe drive-by, a rotating banner came up with an SWF file that exploited them, some sort of Adobe attack. You wouldn't be able to see that easily in this instance. Here's graphical DNS; again, you're looking at just the simple requests. This is probably the only thing I'll jump to Wireshark for, because it does it better than the NetWitness freeware tool. Someone mentioned NetWitness; they have a freeware tool. It's limited to one gig in capture size, which, depending on your environment, could be like two seconds, or it could be two days. Or if you carved out a whole bunch of data for a particular user, it could be an entire week for them. And you have 25 basic reconstructions you can do with the freeware tool. Why do I like it so much?
Well, metadata. It gives me source and destination IP and port, it extracts JPEGs, and it does HTML and emails and presents it to me on a screen that says how many sessions there were, and I can pull it up quickly and easily and actually look at it. Or better yet, have a junior-level analyst look at it and say, hey, we're flagging this session, somebody else needs to review it. We can simply pivot on the metadata. So I click on: show me all this user's traffic to this destination, or just show me all the traffic to this destination. And it quickly pulls up a next screen with all the sources that went to that destination. So if there was an attack and you know the URL, you can look across your enterprise, or at a particular user or a set of users, and say, hey, how many people went there? Well, you can go through your proxy logs. Maybe that's not an option; maybe you don't even have a proxy. You can go back and use the packets to reconstruct that.

The session reconstruction, we'll see. Simple artifact retrieval: it's literally just, oh, show me all the files. Okay, here are the files, here are the file names. Click on one, download it. You can immediately start doing your analysis. And for people who may do malware analysis, I actually recommend downloading the freeware tool and putting it in the VM where you're doing your reversing. Because if it's going out to grab other artifacts, you know, a secondary payload, yeah, you can get it off disk; yeah, maybe CaptureBAT actually caught it before it was deleted and saved it for you. But you also have the packet. You can look at the header and say, hey, look, they're using a user agent that's totally different from anything we've ever seen. Now you can use that to go through all the packets and look for it.

Simple rule creation and filtering: we can take information we have when importing a capture to either exclude packets or include them. We can write simple rules like, all right, DNS: show me all the DNS that resolves to 127.0.0.1, a simple beaconing command, and flag it as an alert to basically pop up when you import something. If you have a base set of rules, it'll populate everything for you. So, my favorite, the metadata. This is what it looks like, pieces and parts of it, when you actually go into the user interface. But what did you just say about commands going through the local port? You're doing another sniffing and looking for commands on the local port? I'm sorry, can you repeat that? You said you're looking for things that are going to 127.0.0.1. Right. So you could use that as a rule. You could write one to say: show me DNS that resolves to 127.0.0.1. And when you import your packets, it'll pop up as an alert, which I don't think I have up there.

So you see this is a collection I used. And there are a couple of collections out there on the Internet. One that I like happens to be from an NSA military exercise with the US Military Academy, and if you search for CDX and ITOC, you'll pull up the data set. It's like 12 gigs of packet capture and Snort logs. Really good to go through and practice; you can work on your kung fu with packets that way. So I just imported a simple one. And you see the green is the sessions, so that's how many times there was an HTTP session. DNS, that's how many DNS calls there were. It breaks out the host names. Source IP addresses, of course; you can see this was done on, I think, my local network. The destination, so where all the traffic's going.
How many sessions were going to it. So if you knew that, all right, the secondary payload for this exploit came from this URL or this remote IP address, you could simply pivot on that and open up all those sessions. By clicking on a session, it gives you all the packet information, and you can go in and look at every single one individually. It breaks out email addresses, the to's, the from's, your subjects, attachments, file names. And all of this, again, is done automatically when you import the packets into the NetWitness tool. So it does a lot of the legwork for you, and then you can get to the investigation a lot faster.

OK, so here's that same email we looked at in Wireshark, just to give you an example of how much more it looks like what someone would see when they open it in, say, Outlook Express or Outlook. This is how it's actually going to display. They do a reconstruction of it for the analysts. And if there were attachments here, you would see them right here, and you could download and grab those. URLs would be reconstructed, inline images. Here's where, like I said, I prefer Wireshark for DNS, because you see it doesn't really lend itself to being read very easily. But there happens to be a button right there, so you can say: open it as a PCAP. So even though you imported the packets, you can also export them back out, just that individual session. You open it up in Wireshark and use Wireshark for what it's best at. Here's a reconstruction of part of the page that we looked at earlier. And you see, if you show that to somebody, it's a heck of a lot easier to interpret than, say, all those characters on the screen from the transfer. And even if it was just ASCII or HTML, it's a lot easier when you see something like this to go, hey, wait a minute, there's a space where this banner ad should be, but it's not there. Because it will not reconstruct, like, malicious JavaScript or Adobe content. It'll give you the option to download that file, but it will not actually render it for you, in case it's malicious, which is good.

OK, so artifact retrieval, it makes it really easy. In this instance I had an attachment; you basically click on it, and you can save it to disk and start your analysis right there if you want to, or export it off and analyze it on another machine. Makes it pretty easy.

OK, so I talked about the alerts. One of the popular things that I like to use for alerts, because I use this at an enterprise level, not the freeware version, but all of this is backward-applicable to the freeware version, is I write an alert that basically says: show me anything in the bogon nets. So that'd be like 127, 172.16. I'm looking for basic sleep commands, because in an enterprise environment I should not see non-routable addresses returned in a DNS request, or attempted to be used at all. That could indicate, A, something's misconfigured, or B, there's maybe some malware on the network that's being told to sleep, kind of like some of the people in the audience. And they're really easy to write, like this one here at the end. And that's a slight typo on my slide; it should say bogon, with service equals 53. So: show me DNS where the alias IP, basically the resolved IP address, starts out in the 127 network; 169.254, the illustrious Microsoft "I can't find a DHCP server" address space; 172.16. And I have a list of them that I use.
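Outside of NetWitness, a quick-and-dirty command-line approximation of that bogon-DNS idea, assuming a day's capture file (the name is hypothetical) and leaning on tcpdump's decoder plus grep rather than proper DNS parsing:

    # Carve DNS out of a day's capture and flag answers landing in
    # loopback, link-local, or 172.16/12 space. Crude: this greps
    # tcpdump's decoded text instead of parsing the DNS answers.
    tcpdump -nn -r /data/pcaps/today.pcap 'udp port 53' \
        | grep -E ' (127|169\.254|172\.(1[6-9]|2[0-9]|3[01]))\.'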
It's pretty effective at occasionally finding things and flagging things you weren't looking for. One of my other favorite features is the timeline function. You see these little dots where there are more sessions. If you're looking at just an individual user's traffic (this is actually a DNS graph, by the way), anything peculiar right there? This is actually for a group of hosts, but you see how the session count, which is basically how many times communication was made, is at a regular interval. OK, that's not something that's easy to show on a command line, but that gives you an automatic, hey, what is that? And you can easily open up those sessions and look at them, or highlight the area and zoom in on it, so you can go back to those packets, see the source, the destination, maybe the host they were going to, the domain names that were resolved during that time, and investigate them easily. This actually happened to be a beaconing pattern. And from a statistical analysis point of view with DNS, these are off. They're not at an exact 60 minutes or 30 minutes or whatever the time interval was. It was like 61 minutes one time, 60 the next, 63 the next time. It was off just enough that if you wrote a query for something extremely repetitive, you would miss it. But by looking at it visually, it's awesome. Yeah, for me at least. And that's it. Wow, I flew through that. All right. So any questions? Wow. Oh, wait, in the back. All the way back first.

So one of the problems I had was when I was taking packet dumps from an HTTP stream, some of the web servers are gzipping the data. With some tools, you can uncompress that stream so you can search it. Does Wireshark do that? I was going to say, well, I'm not sure about Wireshark. You probably can save those packets off and actually export the payload, then un-gzip it yourself if you wanted to extract it. But NetWitness will automatically reconstruct it and do that for you. It'll un-gzip it, which is what you saw in the slide where I showed you the graphical piece. Because when you say the HTTP stream is gzipped: they're actually zipping it up on the server side to save themselves bandwidth, and when it gets to your client, it's unzipped automatically. NetWitness does that. Wireshark just shows you straight up: this is the packet, this is what it looked like on the wire. And you can look at it either way in NetWitness. There's a reconstruction button where you click and say, oh, show me hex, show me ASCII, or show me a reconstruction, which will basically try to reassemble everything as best it can.

Go ahead, Adrian. Have you seen Network Miner? Network Miner? It's an open-source tool, Windows only. Yes, I know what you're talking about now. Yeah, it's incredibly nifty for dumping things out of captures you do, from FTP, HTTP, or SMB file transfers, OS detection. Yeah, it has a lot of neat functions. Yeah, it does the password and capture stuff, but other things do that better. But for those functionalities, the OS detection, because it uses the fingerprints from Ettercap and p0f and a ton of other tools, all built into this one little easy-to-use interface. Yeah, no, I have seen it. And it's Windows only, which is handy. Network Miner is the tool that Adrian just mentioned. It does some of the same things, similar to NetWitness, maybe not as extensible, but it's open source.
You don't have to worry about packet capture size or limitations on the number of sessions. There's another question in the back. Yeah. It seems like having that traffic data stored on disk could be a nice way to leverage some kind of data loss prevention technology, though I guess that data could be encrypted, zipped, binary. Right. All right, so it was mentioned that with all the packet data, maybe you could do some sort of DLP on it. And I think that's actually entirely possible. But one of the biggest problems with any DLP solution, and I haven't actually implemented one, but I know many people who have, is you have to mark all the data. So you need to mark it as proprietary, confidential, and then something actually looks at it, maybe in your email stream before it goes out, and looks at all the attachments for those markings that say it may be proprietary. So it's entirely doable, but you'd have to find a way to mark a document so that it has a unique packet string. And in that case, you could use Snort, for that matter. If you're looking for a particular byte string that says, this is XYZ Acme Company proprietary, and you can see that on the wire, then you could just use Snort at that point, which you can do because you have the packet data. DLP will also recognize signatures, so something that looks like a social security number; it's smart. And intellectual property, et cetera. But if you have to do it at whatever your internet speed is, that's a huge, huge problem.

And in the enterprise instance of NetWitness, you can do that. They have what they call parsers. One of the things that I actually forgot to touch on when I was flying through this that I'd like to offer: they do the protocol parsing. HTTP, we know what it looks like. SSL, we know what it looks like. If they find something they don't know what it looks like, an XOR'd string, an encoded command shell, they flag that as "other." Okay? So you could easily just say: show me all the traffic from this machine we think is infected with Zeus bot or some other advanced threat. It may be doing some other sort of communication, but we can't see anything; we don't know what it's doing. It may flag it and say, hey, it's going over port 80, or it's going over port 443, but it's not actually SSL or legitimate HTTP traffic. It'll flag it for you, and you can easily go, oh, show me those strings. And like I said, if you have the packets, you go back to the machine, you have one of your malware ninjas figure out what the key is, and you can go back and break the payload and actually see everything that was transferred. Oh, they exfiltrated data. Well, how much data? Well, we don't know how much, but this is how much we have. And then you can see what they were exfiltrating: files, documents, you know, your company holiday party schedule, how much money was spent on booze, who knows. Any other questions?

How are you actually storing the pcap data? Is it just the binary pcap? Yes. Are you using like a hard disk or something like that? On the hard disk, it's just binary pcap, because the problem is, if you try to manipulate it at all, when you want to go through one of those scenarios where you want to replay it in Snort or something, you need a pcap anyway.
So if you just save it in pcap, in my professional opinion, it makes it easier, because then you can apply all kinds of things to it. If you throw it into a database, you have to get it out of that database; that's additional I/O, additional storage, and databases have overhead. It's just easier with flat files. A 512-meg file can be written to disk fairly easily, and if you have enough disks, it can be read off disk in a second, or a nanosecond actually; it's ridiculously fast and easy to do. So I recommend just pcaps. One problem you'll run into is directory structure. Maybe you have to do it by day or something, because depending on how much you're trying to save, you don't want a directory with 10,000 files in it with just time and date stamps. It becomes kind of overbearing.

All the way in the back. If you're doing packet dumps on a Windows box somewhere, be careful, because sometimes the drivers are adjusting the packets before you get to them. Nice. So it was mentioned: be careful with Windows-based packet captures, because the drivers may be modifying the packets, so you're not getting the true packet information. One of the nicest things about Network Miner, though: you can capture in Linux or with some other tool, just make a PCAP file, and then load it in there. I've been having a lot of problems getting a wireless card that actually supports promiscuous mode correctly. And I do mean promiscuous mode, not monitor mode; that distinction causes angry discussions when I have it with people. But I mean, I have a hard time finding a wireless card that truly supports promiscuous mode. I've seen those muck with packets, but have you seen straight-up regular Ethernet cards in Windows mess with the packets in some major way? Wireless? No, no. Wireless, I've seen some issues. Using straight-up Ethernet adapters in Windows. What kind of modifications are you talking about to the packets? Something like, for example, if you have a VLAN, they strip that VLAN tag off, so you don't see the VLAN as you would see it straight on the wire; it disappears. Because of the frame size on the machine, the driver doesn't support it, or something like that. Which could be bad. It might be where it's capturing from, too, because in Windows a lot of times with a local capture you have to check some settings, which may be where that's happening. But if you're doing a promiscuous capture, maybe that could be something. Right. Again, in an enterprise scenario I would recommend using a tap anyway, which is going to give you everything. If you use a SPAN, you could run into issues with not getting the complete picture.

Go ahead. Being in private industry, in terms of your storage of your data, how do you determine your length of time to keep it stored? Unlike government agencies or facilities where they have requirements, how do you determine a proper storage time frame, and keep that data secure from other individuals in the event of a compromise? Right. That is a big question. So the question was how we determined our length of time for data retention: how long do we want to keep the packets, and how do we secure them? It's really hard, but we actually have kind of an indefinite storage policy right now. We're at over 120 days. At some point we'd have to recycle tapes, but our ultimate goal is a year: we write the packets off to tape, and then we can go back and reference them.
So we can use them for an investigation if we find out three months later something happened. We can go, oh, OK, well, go back to the tapes, restore those days. The problem with that scenario is it's really better if you can keep it online or in near-term storage on disk, because a terabyte of data takes a while to write to tape, and now you have to read it back from tape, or read the indexes back, to restore it. It's going to take eight hours, not eight man-hours but eight processing hours, to restore to disk. OK, that's one day. Now you need a whole week's worth, or maybe you're talking months. All right, well, this goes back to: we found it this week, how long has it been going on? You have to incrementally go back through it, and it takes a ridiculous amount of time. You just have to make those concessions. Your litigation team may have some input on that and say, hey, we don't want you keeping anything; we don't keep logs any longer than 60 days or 120 days because of this. So that'll make it easier for you. And in the event they say, well, it's on tape: now somebody else is responsible. Hey, we have the evidence here, you go find it. It's not my problem. I'm not paying somebody to restore tapes for the next three-month time period; it's going to take them almost three months, or a month and a half, to restore the data, depending on how well you can get it off of there. It's a difficult business decision, but we started at 90 days and went from there. Any other questions? Awesome. Thanks.

[Applause]