Can Beehive detect a Snowden-like actor?
up vote
51
down vote
favorite
In a seminar, one of the Authors of Beehive: Large-Scale Log Analysis for Detecting Suspicious Activity in Enterprise Networks said that this system can prevent actions like Snowden did.
From their articles' conclusions;
Beehive improves on signature-based approaches to detecting security incidents. Instead, it flags suspected security incidents in hosts based on behavioral analysis. In our evaluation, Beehive detected malware infections and policy violations that went otherwise unnoticed by existing, state-of-the-art security tools and personal.
Can Beehive or a similar system prevent Snowden type action?
malware antimalware corporate-policy detection incident-response
|
show 6 more comments
up vote
51
down vote
favorite
In a seminar, one of the Authors of Beehive: Large-Scale Log Analysis for Detecting Suspicious Activity in Enterprise Networks said that this system can prevent actions like Snowden did.
From their articles' conclusions;
Beehive improves on signature-based approaches to detecting security incidents. Instead, it flags suspected security incidents in hosts based on behavioral analysis. In our evaluation, Beehive detected malware infections and policy violations that went otherwise unnoticed by existing, state-of-the-art security tools and personal.
Can Beehive or a similar system prevent Snowden type action?
malware antimalware corporate-policy detection incident-response
36
Simple answer: No, most certainly not. Snowden was someone who had privileged access and had the authority and reason to mass-download content (he was a sysadmin).
– forest
2 days ago
3
But in the training case, they model everybody according to their behavior. So, after the training, a mass download will be a behavioral change that will produce an alert signal.
– kelalaka
2 days ago
6
Unless mass-downloading is 1) not common and 2) it's not possible to just throttle the download.
– forest
2 days ago
13
Why "mass download" is even considered suspicious. there are will be some sorts of constant "mass" downloads during everyday usage, was my first thought. What is mass download? 1 MB? 500 MB ? 5 GB? 500 GB? ...
– Croll
2 days ago
8
@Croll If your organisation has one million files, any one person probably doesn't need to access anywhere close to that many in order to do their job (most files won't be related to their work). If somebody starts trying to download all one million over a day or two, that's suspicious. Even a small percentage of that one million could be suspicious. 1% of one million is 10,000 files. How many people working for your organisation need to access 10,000 files over the span of 48 hours to do their job? Very few (if any).
– Anthony Grist
2 days ago
|
show 6 more comments
up vote
51
down vote
favorite
up vote
51
down vote
favorite
In a seminar, one of the Authors of Beehive: Large-Scale Log Analysis for Detecting Suspicious Activity in Enterprise Networks said that this system can prevent actions like Snowden did.
From their articles' conclusions;
Beehive improves on signature-based approaches to detecting security incidents. Instead, it flags suspected security incidents in hosts based on behavioral analysis. In our evaluation, Beehive detected malware infections and policy violations that went otherwise unnoticed by existing, state-of-the-art security tools and personal.
Can Beehive or a similar system prevent Snowden type action?
malware antimalware corporate-policy detection incident-response
In a seminar, one of the Authors of Beehive: Large-Scale Log Analysis for Detecting Suspicious Activity in Enterprise Networks said that this system can prevent actions like Snowden did.
From their articles' conclusions;
Beehive improves on signature-based approaches to detecting security incidents. Instead, it flags suspected security incidents in hosts based on behavioral analysis. In our evaluation, Beehive detected malware infections and policy violations that went otherwise unnoticed by existing, state-of-the-art security tools and personal.
Can Beehive or a similar system prevent Snowden type action?
malware antimalware corporate-policy detection incident-response
malware antimalware corporate-policy detection incident-response
edited 2 days ago
Johnny
468113
468113
asked 2 days ago
kelalaka
4511310
4511310
36
Simple answer: No, most certainly not. Snowden was someone who had privileged access and had the authority and reason to mass-download content (he was a sysadmin).
– forest
2 days ago
3
But in the training case, they model everybody according to their behavior. So, after the training, a mass download will be a behavioral change that will produce an alert signal.
– kelalaka
2 days ago
6
Unless mass-downloading is 1) not common and 2) it's not possible to just throttle the download.
– forest
2 days ago
13
Why "mass download" is even considered suspicious. there are will be some sorts of constant "mass" downloads during everyday usage, was my first thought. What is mass download? 1 MB? 500 MB ? 5 GB? 500 GB? ...
– Croll
2 days ago
8
@Croll If your organisation has one million files, any one person probably doesn't need to access anywhere close to that many in order to do their job (most files won't be related to their work). If somebody starts trying to download all one million over a day or two, that's suspicious. Even a small percentage of that one million could be suspicious. 1% of one million is 10,000 files. How many people working for your organisation need to access 10,000 files over the span of 48 hours to do their job? Very few (if any).
– Anthony Grist
2 days ago
|
show 6 more comments
36
Simple answer: No, most certainly not. Snowden was someone who had privileged access and had the authority and reason to mass-download content (he was a sysadmin).
– forest
2 days ago
3
But in the training case, they model everybody according to their behavior. So, after the training, a mass download will be a behavioral change that will produce an alert signal.
– kelalaka
2 days ago
6
Unless mass-downloading is 1) not common and 2) it's not possible to just throttle the download.
– forest
2 days ago
13
Why "mass download" is even considered suspicious. there are will be some sorts of constant "mass" downloads during everyday usage, was my first thought. What is mass download? 1 MB? 500 MB ? 5 GB? 500 GB? ...
– Croll
2 days ago
8
@Croll If your organisation has one million files, any one person probably doesn't need to access anywhere close to that many in order to do their job (most files won't be related to their work). If somebody starts trying to download all one million over a day or two, that's suspicious. Even a small percentage of that one million could be suspicious. 1% of one million is 10,000 files. How many people working for your organisation need to access 10,000 files over the span of 48 hours to do their job? Very few (if any).
– Anthony Grist
2 days ago
36
36
Simple answer: No, most certainly not. Snowden was someone who had privileged access and had the authority and reason to mass-download content (he was a sysadmin).
– forest
2 days ago
Simple answer: No, most certainly not. Snowden was someone who had privileged access and had the authority and reason to mass-download content (he was a sysadmin).
– forest
2 days ago
3
3
But in the training case, they model everybody according to their behavior. So, after the training, a mass download will be a behavioral change that will produce an alert signal.
– kelalaka
2 days ago
But in the training case, they model everybody according to their behavior. So, after the training, a mass download will be a behavioral change that will produce an alert signal.
– kelalaka
2 days ago
6
6
Unless mass-downloading is 1) not common and 2) it's not possible to just throttle the download.
– forest
2 days ago
Unless mass-downloading is 1) not common and 2) it's not possible to just throttle the download.
– forest
2 days ago
13
13
Why "mass download" is even considered suspicious. there are will be some sorts of constant "mass" downloads during everyday usage, was my first thought. What is mass download? 1 MB? 500 MB ? 5 GB? 500 GB? ...
– Croll
2 days ago
Why "mass download" is even considered suspicious. there are will be some sorts of constant "mass" downloads during everyday usage, was my first thought. What is mass download? 1 MB? 500 MB ? 5 GB? 500 GB? ...
– Croll
2 days ago
8
8
@Croll If your organisation has one million files, any one person probably doesn't need to access anywhere close to that many in order to do their job (most files won't be related to their work). If somebody starts trying to download all one million over a day or two, that's suspicious. Even a small percentage of that one million could be suspicious. 1% of one million is 10,000 files. How many people working for your organisation need to access 10,000 files over the span of 48 hours to do their job? Very few (if any).
– Anthony Grist
2 days ago
@Croll If your organisation has one million files, any one person probably doesn't need to access anywhere close to that many in order to do their job (most files won't be related to their work). If somebody starts trying to download all one million over a day or two, that's suspicious. Even a small percentage of that one million could be suspicious. 1% of one million is 10,000 files. How many people working for your organisation need to access 10,000 files over the span of 48 hours to do their job? Very few (if any).
– Anthony Grist
2 days ago
|
show 6 more comments
5 Answers
5
active
oldest
votes
up vote
121
down vote
A backup operator will have the permission and behavioral markers of someone that moves lots of data around. Like any sysadmin where there's no dedicated backup operator in place.
Snowden was a sysadmin. He would knew all the protection protocols in place. He could just impersonate anyone, from any area, download things, impersonate the next one, and keep doing that.
If it's common knowledge that there's no bulletproof protection against a dedicated attacker, imagine a trusted internal dedicated attacker with sysadmin privileges.
149
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
1
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
add a comment |
up vote
16
down vote
Anomaly detection systems like Beehive make it easier than before to dig through lots of data and detect suspicious behavior. This means that it is possible for an analyst to focus on the more relevant data, process more data in shorter time and also use more detailed input data for the analysis. This way the chance is higher than before that somebody can detect unwanted behavior.
It is claimed (and I have no reason to doubt this claim) in the Beehive paper that the system can detect more incidents than the usually used systems - but it is not claimed that the system can detect every incident or even how much of all incidents it could detect. Thus, it might be that other systems only detect 10% of all incidents and Beehive detects 20%, which is good but not really satisfactory.
Could such a system detect somebody like Snowden? This depends a lot on how much and what kind and what detail of data is collected for analysis, how strict the existing security policies are in the first place so that policy violations can be logged and how much the illegal (as seen by the NSA) activities of Snowden differed from his usual work activity. The more it differs the more likely it can be detected by anomaly detection system. But the more similar illegal and legal activities are in terms of the logged data, the less likely is that illegal activities will be reported as anomaly.
In other words: it could help to detect some Snowden type actions but it will not detect all Snowden type actions. And preventing such actions would be even more difficult, more likely is a more early detection after some harm was already done and thus limiting the impact.
2
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
5
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
add a comment |
up vote
12
down vote
Snowden's intent was data exfiltration and he was also a system admin. So, he had access to large amounts of data normal users didn't and would have a different pattern of how he interacts with the network. If Beehive was in place, it may have logged that he was doing something but anyone who has an intent of data exfiltration would've known how to bypass alerting: make your pattern of data exfiltration "normal" from the time the system started getting trained and it wouldn't be flagged as anomalous activity. Snowden could've had pattern of dumping 16GB a day to a USB thumb drive but as long as he didn't do sudden change in his techniques, Beehive wouldn't have flagged him.
I'm working on some custom ways at work to detect this kind of pattern. But, right now I don't know of anything automated that'll do a good job.
New contributor
add a comment |
up vote
8
down vote
No it can't.
And the quote that you pulled clearly explained why not, and how people came to claim that it could.
What Beehive might be able to do is tell you that a Snowden-style attack has taken place. (even thoguh goin by @ThoriumBR a SNOWDEN would not have been prevented)
What you (or that guy) claims is that it could PREVENT such an attack, which is far, far different.
Beehive is crawling logs and (maybe, didn't read too much) combining that with some advanced analysis.
Which means that even if your analysis-and-flagging system is running in real-time it would probably be too late.
[Just imagine where Beehive comes in:
Suspicious action -> security program -> log -> beehive extracts data -> beehive analysis -> flag thrown -> intervention?
This is far too late (and it assumes that the logs are evaluated in real-time]
Logs are for retroactive investigation, not real-time intervention.
What you could do is produce a pseudo-log for any action, have that analysed by Beehive and only upon being greenlit the action is executed.
The enormous overhead and noticeable delay would make that approach a really hard sell to any manager though. [also, not using logs but build in evaluating-mechanisms in your platform would be far better]
New contributor
6
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
add a comment |
up vote
1
down vote
First of all, there is a very important distinction between being able to detect a "Snowden-like" actor and being able to prevent one. As far as I have seen, Beehive makes no claims about preventing one, but rather seems to promise the ability to give you alerts that suspicious activity is happening in your network. Sure, not as good, but still considered a "holy grail" in some research communities.
With that said, I'm extremely doubtful that Beehive is able to meet those expectations. Machine learning can do quite well at extracting complex patterns from large piles of data with reliable identities. For example, differentiating between pictures of cats and dogs is extremely reliable; we can all do it 99+% of the time, yet if I had to tell what's the exact algorithm for taking in 100x100 pixels and determining cat vs dog, I have no idea how I would do that. But I can supply you with 100,000 of such images and let ML methods figure out a rule that reliably differentiates between the two based on the values of 100x100 pixels. If I do things right, the rules created by ML should even work on new images of cats and dogs, assuming no huge changes in the new data (i.e., if I only used labs and tabby cats in the training data, then try to get it to identify a terrier...good luck). That's ML's strength.
Determining "suspicious behavior" is a much more difficult issue. We don't have 100,000's of samples of confirmed bad behavior, and we don't even really have 100,000's of samples of confirmed good behavior! Worse yet, what was a good ML method that worked yesterday doesn't work today; unlike cats and dogs in photos, adversaries try really hard to trick you. Most people I know working on ML for cyber security have accepted that the idea of purely automated detection is beyond our grasp at the moment, but perhaps we can build tools to automate very specific repetitive tasks that a security analyst needs to do over and over, thus making them more efficient.
With that said, the authors of Beehive seem to have skipped that lesson and claim that they've solved this problem. I'm highly suspicious of the performance, especially given that the methods they suggest are the first one a ML researcher might think to try and have routinely been rejected as not useful. For example, they suggest using PCA to identify outliers in logs. This, and variations of it, has been tried 100s of times and the result is always that the security analyst shuts off the "automated detection" because they get so many false positives that it costs way more time than it saves.
Of course, in all these methods, the devil is the details and the details of these types of methods never really get exposed in published work ("we used PCA to look for outliers in server logs" is an extremely vague statement). It's always possible that they have some super clever way of preprocessing the data before applying their methods that didn't make it into the paper. But I'd be willing to bet my right arm that no user of Beehive will be able to reliably differentiate between "Snowden-like" behavior and normal real world use of a network in real time.
New contributor
add a comment |
5 Answers
5
active
oldest
votes
5 Answers
5
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
121
down vote
A backup operator will have the permission and behavioral markers of someone that moves lots of data around. Like any sysadmin where there's no dedicated backup operator in place.
Snowden was a sysadmin. He would knew all the protection protocols in place. He could just impersonate anyone, from any area, download things, impersonate the next one, and keep doing that.
If it's common knowledge that there's no bulletproof protection against a dedicated attacker, imagine a trusted internal dedicated attacker with sysadmin privileges.
149
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
1
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
add a comment |
up vote
121
down vote
A backup operator will have the permission and behavioral markers of someone that moves lots of data around. Like any sysadmin where there's no dedicated backup operator in place.
Snowden was a sysadmin. He would knew all the protection protocols in place. He could just impersonate anyone, from any area, download things, impersonate the next one, and keep doing that.
If it's common knowledge that there's no bulletproof protection against a dedicated attacker, imagine a trusted internal dedicated attacker with sysadmin privileges.
149
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
1
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
add a comment |
up vote
121
down vote
up vote
121
down vote
A backup operator will have the permission and behavioral markers of someone that moves lots of data around. Like any sysadmin where there's no dedicated backup operator in place.
Snowden was a sysadmin. He would knew all the protection protocols in place. He could just impersonate anyone, from any area, download things, impersonate the next one, and keep doing that.
If it's common knowledge that there's no bulletproof protection against a dedicated attacker, imagine a trusted internal dedicated attacker with sysadmin privileges.
A backup operator will have the permission and behavioral markers of someone that moves lots of data around. Like any sysadmin where there's no dedicated backup operator in place.
Snowden was a sysadmin. He would knew all the protection protocols in place. He could just impersonate anyone, from any area, download things, impersonate the next one, and keep doing that.
If it's common knowledge that there's no bulletproof protection against a dedicated attacker, imagine a trusted internal dedicated attacker with sysadmin privileges.
answered 2 days ago
ThoriumBR
19.9k54868
19.9k54868
149
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
1
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
add a comment |
149
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
1
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
149
149
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
TL;dr: you can't protect yourself against yourself.
– Braiam
2 days ago
1
1
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
Comments are not for extended discussion; this conversation has been moved to chat.
– Jeff Ferland♦
yesterday
add a comment |
up vote
16
down vote
Anomaly detection systems like Beehive make it easier than before to dig through lots of data and detect suspicious behavior. This means that it is possible for an analyst to focus on the more relevant data, process more data in shorter time and also use more detailed input data for the analysis. This way the chance is higher than before that somebody can detect unwanted behavior.
It is claimed (and I have no reason to doubt this claim) in the Beehive paper that the system can detect more incidents than the usually used systems - but it is not claimed that the system can detect every incident or even how much of all incidents it could detect. Thus, it might be that other systems only detect 10% of all incidents and Beehive detects 20%, which is good but not really satisfactory.
Could such a system detect somebody like Snowden? This depends a lot on how much and what kind and what detail of data is collected for analysis, how strict the existing security policies are in the first place so that policy violations can be logged and how much the illegal (as seen by the NSA) activities of Snowden differed from his usual work activity. The more it differs the more likely it can be detected by anomaly detection system. But the more similar illegal and legal activities are in terms of the logged data, the less likely is that illegal activities will be reported as anomaly.
In other words: it could help to detect some Snowden type actions but it will not detect all Snowden type actions. And preventing such actions would be even more difficult, more likely is a more early detection after some harm was already done and thus limiting the impact.
2
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
5
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
add a comment |
up vote
16
down vote
Anomaly detection systems like Beehive make it easier than before to dig through lots of data and detect suspicious behavior. This means that it is possible for an analyst to focus on the more relevant data, process more data in shorter time and also use more detailed input data for the analysis. This way the chance is higher than before that somebody can detect unwanted behavior.
It is claimed (and I have no reason to doubt this claim) in the Beehive paper that the system can detect more incidents than the usually used systems - but it is not claimed that the system can detect every incident or even how much of all incidents it could detect. Thus, it might be that other systems only detect 10% of all incidents and Beehive detects 20%, which is good but not really satisfactory.
Could such a system detect somebody like Snowden? This depends a lot on how much and what kind and what detail of data is collected for analysis, how strict the existing security policies are in the first place so that policy violations can be logged and how much the illegal (as seen by the NSA) activities of Snowden differed from his usual work activity. The more it differs the more likely it can be detected by anomaly detection system. But the more similar illegal and legal activities are in terms of the logged data, the less likely is that illegal activities will be reported as anomaly.
In other words: it could help to detect some Snowden type actions but it will not detect all Snowden type actions. And preventing such actions would be even more difficult, more likely is a more early detection after some harm was already done and thus limiting the impact.
2
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
5
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
add a comment |
up vote
16
down vote
up vote
16
down vote
Anomaly detection systems like Beehive make it easier than before to dig through lots of data and detect suspicious behavior. This means that it is possible for an analyst to focus on the more relevant data, process more data in shorter time and also use more detailed input data for the analysis. This way the chance is higher than before that somebody can detect unwanted behavior.
It is claimed (and I have no reason to doubt this claim) in the Beehive paper that the system can detect more incidents than the usually used systems - but it is not claimed that the system can detect every incident or even how much of all incidents it could detect. Thus, it might be that other systems only detect 10% of all incidents and Beehive detects 20%, which is good but not really satisfactory.
Could such a system detect somebody like Snowden? This depends a lot on how much and what kind and what detail of data is collected for analysis, how strict the existing security policies are in the first place so that policy violations can be logged and how much the illegal (as seen by the NSA) activities of Snowden differed from his usual work activity. The more it differs the more likely it can be detected by anomaly detection system. But the more similar illegal and legal activities are in terms of the logged data, the less likely is that illegal activities will be reported as anomaly.
In other words: it could help to detect some Snowden type actions but it will not detect all Snowden type actions. And preventing such actions would be even more difficult, more likely is a more early detection after some harm was already done and thus limiting the impact.
Anomaly detection systems like Beehive make it easier than before to dig through lots of data and detect suspicious behavior. This means that it is possible for an analyst to focus on the more relevant data, process more data in shorter time and also use more detailed input data for the analysis. This way the chance is higher than before that somebody can detect unwanted behavior.
It is claimed (and I have no reason to doubt this claim) in the Beehive paper that the system can detect more incidents than the usually used systems - but it is not claimed that the system can detect every incident or even how much of all incidents it could detect. Thus, it might be that other systems only detect 10% of all incidents and Beehive detects 20%, which is good but not really satisfactory.
Could such a system detect somebody like Snowden? This depends a lot on how much and what kind and what detail of data is collected for analysis, how strict the existing security policies are in the first place so that policy violations can be logged and how much the illegal (as seen by the NSA) activities of Snowden differed from his usual work activity. The more it differs the more likely it can be detected by anomaly detection system. But the more similar illegal and legal activities are in terms of the logged data, the less likely is that illegal activities will be reported as anomaly.
In other words: it could help to detect some Snowden type actions but it will not detect all Snowden type actions. And preventing such actions would be even more difficult, more likely is a more early detection after some harm was already done and thus limiting the impact.
edited 2 days ago
answered 2 days ago
Steffen Ullrich
110k12191256
110k12191256
2
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
5
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
add a comment |
2
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
5
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
2
2
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
And the false positives... Wow, imagine you got promoted to a System Admin position and then suddenly you have federal agents show up at your door...
– Nelson
yesterday
5
5
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
@Nelson Federal agents will be at your door long before that if you're in the running for a sysadmin position. Get ready for looooads of profiling and interviews.
– Lightness Races in Orbit
yesterday
add a comment |
up vote
12
down vote
Snowden's intent was data exfiltration and he was also a system admin. So, he had access to large amounts of data normal users didn't and would have a different pattern of how he interacts with the network. If Beehive was in place, it may have logged that he was doing something but anyone who has an intent of data exfiltration would've known how to bypass alerting: make your pattern of data exfiltration "normal" from the time the system started getting trained and it wouldn't be flagged as anomalous activity. Snowden could've had pattern of dumping 16GB a day to a USB thumb drive but as long as he didn't do sudden change in his techniques, Beehive wouldn't have flagged him.
I'm working on some custom ways at work to detect this kind of pattern. But, right now I don't know of anything automated that'll do a good job.
New contributor
add a comment |
up vote
12
down vote
Snowden's intent was data exfiltration and he was also a system admin. So, he had access to large amounts of data normal users didn't and would have a different pattern of how he interacts with the network. If Beehive was in place, it may have logged that he was doing something but anyone who has an intent of data exfiltration would've known how to bypass alerting: make your pattern of data exfiltration "normal" from the time the system started getting trained and it wouldn't be flagged as anomalous activity. Snowden could've had pattern of dumping 16GB a day to a USB thumb drive but as long as he didn't do sudden change in his techniques, Beehive wouldn't have flagged him.
I'm working on some custom ways at work to detect this kind of pattern. But, right now I don't know of anything automated that'll do a good job.
New contributor
add a comment |
up vote
12
down vote
up vote
12
down vote
Snowden's intent was data exfiltration and he was also a system admin. So, he had access to large amounts of data normal users didn't and would have a different pattern of how he interacts with the network. If Beehive was in place, it may have logged that he was doing something but anyone who has an intent of data exfiltration would've known how to bypass alerting: make your pattern of data exfiltration "normal" from the time the system started getting trained and it wouldn't be flagged as anomalous activity. Snowden could've had pattern of dumping 16GB a day to a USB thumb drive but as long as he didn't do sudden change in his techniques, Beehive wouldn't have flagged him.
I'm working on some custom ways at work to detect this kind of pattern. But, right now I don't know of anything automated that'll do a good job.
New contributor
Snowden's intent was data exfiltration and he was also a system admin. So, he had access to large amounts of data normal users didn't and would have a different pattern of how he interacts with the network. If Beehive was in place, it may have logged that he was doing something but anyone who has an intent of data exfiltration would've known how to bypass alerting: make your pattern of data exfiltration "normal" from the time the system started getting trained and it wouldn't be flagged as anomalous activity. Snowden could've had pattern of dumping 16GB a day to a USB thumb drive but as long as he didn't do sudden change in his techniques, Beehive wouldn't have flagged him.
I'm working on some custom ways at work to detect this kind of pattern. But, right now I don't know of anything automated that'll do a good job.
New contributor
New contributor
answered 2 days ago
RG1
1312
1312
New contributor
New contributor
add a comment |
add a comment |
up vote
8
down vote
No it can't.
And the quote that you pulled clearly explained why not, and how people came to claim that it could.
What Beehive might be able to do is tell you that a Snowden-style attack has taken place. (even thoguh goin by @ThoriumBR a SNOWDEN would not have been prevented)
What you (or that guy) claims is that it could PREVENT such an attack, which is far, far different.
Beehive is crawling logs and (maybe, didn't read too much) combining that with some advanced analysis.
Which means that even if your analysis-and-flagging system is running in real-time it would probably be too late.
[Just imagine where Beehive comes in:
Suspicious action -> security program -> log -> beehive extracts data -> beehive analysis -> flag thrown -> intervention?
This is far too late (and it assumes that the logs are evaluated in real-time]
Logs are for retroactive investigation, not real-time intervention.
What you could do is produce a pseudo-log for any action, have that analysed by Beehive and only upon being greenlit the action is executed.
The enormous overhead and noticeable delay would make that approach a really hard sell to any manager though. [also, not using logs but build in evaluating-mechanisms in your platform would be far better]
New contributor
6
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
add a comment |
up vote
8
down vote
No it can't.
And the quote that you pulled clearly explained why not, and how people came to claim that it could.
What Beehive might be able to do is tell you that a Snowden-style attack has taken place. (even thoguh goin by @ThoriumBR a SNOWDEN would not have been prevented)
What you (or that guy) claims is that it could PREVENT such an attack, which is far, far different.
Beehive is crawling logs and (maybe, didn't read too much) combining that with some advanced analysis.
Which means that even if your analysis-and-flagging system is running in real-time it would probably be too late.
[Just imagine where Beehive comes in:
Suspicious action -> security program -> log -> beehive extracts data -> beehive analysis -> flag thrown -> intervention?
This is far too late (and it assumes that the logs are evaluated in real-time]
Logs are for retroactive investigation, not real-time intervention.
What you could do is produce a pseudo-log for any action, have that analysed by Beehive and only upon being greenlit the action is executed.
The enormous overhead and noticeable delay would make that approach a really hard sell to any manager though. [also, not using logs but build in evaluating-mechanisms in your platform would be far better]
New contributor
6
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
add a comment |
up vote
8
down vote
up vote
8
down vote
No it can't.
And the quote that you pulled clearly explained why not, and how people came to claim that it could.
What Beehive might be able to do is tell you that a Snowden-style attack has taken place. (even thoguh goin by @ThoriumBR a SNOWDEN would not have been prevented)
What you (or that guy) claims is that it could PREVENT such an attack, which is far, far different.
Beehive is crawling logs and (maybe, didn't read too much) combining that with some advanced analysis.
Which means that even if your analysis-and-flagging system is running in real-time it would probably be too late.
[Just imagine where Beehive comes in:
Suspicious action -> security program -> log -> beehive extracts data -> beehive analysis -> flag thrown -> intervention?
This is far too late (and it assumes that the logs are evaluated in real-time]
Logs are for retroactive investigation, not real-time intervention.
What you could do is produce a pseudo-log for any action, have that analysed by Beehive and only upon being greenlit the action is executed.
The enormous overhead and noticeable delay would make that approach a really hard sell to any manager though. [also, not using logs but build in evaluating-mechanisms in your platform would be far better]
New contributor
No it can't.
And the quote that you pulled clearly explained why not, and how people came to claim that it could.
What Beehive might be able to do is tell you that a Snowden-style attack has taken place. (even thoguh goin by @ThoriumBR a SNOWDEN would not have been prevented)
What you (or that guy) claims is that it could PREVENT such an attack, which is far, far different.
Beehive is crawling logs and (maybe, didn't read too much) combining that with some advanced analysis.
Which means that even if your analysis-and-flagging system is running in real-time it would probably be too late.
[Just imagine where Beehive comes in:
Suspicious action -> security program -> log -> beehive extracts data -> beehive analysis -> flag thrown -> intervention?
This is far too late (and it assumes that the logs are evaluated in real-time]
Logs are for retroactive investigation, not real-time intervention.
What you could do is produce a pseudo-log for any action, have that analysed by Beehive and only upon being greenlit the action is executed.
The enormous overhead and noticeable delay would make that approach a really hard sell to any manager though. [also, not using logs but build in evaluating-mechanisms in your platform would be far better]
New contributor
New contributor
answered 2 days ago
Hobbamok
1813
1813
New contributor
New contributor
6
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
add a comment |
6
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
6
6
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
And the false positives. Job promotions will be a nightmare, as will department changes.
– Nelson
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
As a sysadmin, could one simple alter the logs?
– paulj
yesterday
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
@paulj Not if the logs are sent to a remote server or forward-sealed, but that only applies to logs that were already generated. A sysadmin could, of course, forge any subsequent logs.
– forest
14 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
Incidentally (and unrelatedly), modern file systems do have pseudo-logs, which are finalized much more quickly than something like Beehive could match
– jpaugh
4 hours ago
add a comment |
up vote
1
down vote
First of all, there is a very important distinction between being able to detect a "Snowden-like" actor and being able to prevent one. As far as I have seen, Beehive makes no claims about preventing one, but rather seems to promise the ability to give you alerts that suspicious activity is happening in your network. Sure, not as good, but still considered a "holy grail" in some research communities.
With that said, I'm extremely doubtful that Beehive is able to meet those expectations. Machine learning can do quite well at extracting complex patterns from large piles of data with reliable identities. For example, differentiating between pictures of cats and dogs is extremely reliable; we can all do it 99+% of the time, yet if I had to tell what's the exact algorithm for taking in 100x100 pixels and determining cat vs dog, I have no idea how I would do that. But I can supply you with 100,000 of such images and let ML methods figure out a rule that reliably differentiates between the two based on the values of 100x100 pixels. If I do things right, the rules created by ML should even work on new images of cats and dogs, assuming no huge changes in the new data (i.e., if I only used labs and tabby cats in the training data, then try to get it to identify a terrier...good luck). That's ML's strength.
Determining "suspicious behavior" is a much more difficult issue. We don't have 100,000's of samples of confirmed bad behavior, and we don't even really have 100,000's of samples of confirmed good behavior! Worse yet, what was a good ML method that worked yesterday doesn't work today; unlike cats and dogs in photos, adversaries try really hard to trick you. Most people I know working on ML for cyber security have accepted that the idea of purely automated detection is beyond our grasp at the moment, but perhaps we can build tools to automate very specific repetitive tasks that a security analyst needs to do over and over, thus making them more efficient.
With that said, the authors of Beehive seem to have skipped that lesson and claim that they've solved this problem. I'm highly suspicious of the performance, especially given that the methods they suggest are the first one a ML researcher might think to try and have routinely been rejected as not useful. For example, they suggest using PCA to identify outliers in logs. This, and variations of it, has been tried 100s of times and the result is always that the security analyst shuts off the "automated detection" because they get so many false positives that it costs way more time than it saves.
Of course, in all these methods, the devil is the details and the details of these types of methods never really get exposed in published work ("we used PCA to look for outliers in server logs" is an extremely vague statement). It's always possible that they have some super clever way of preprocessing the data before applying their methods that didn't make it into the paper. But I'd be willing to bet my right arm that no user of Beehive will be able to reliably differentiate between "Snowden-like" behavior and normal real world use of a network in real time.
New contributor
add a comment |
up vote
1
down vote
First of all, there is a very important distinction between being able to detect a "Snowden-like" actor and being able to prevent one. As far as I have seen, Beehive makes no claims about preventing one, but rather seems to promise the ability to give you alerts that suspicious activity is happening in your network. Sure, not as good, but still considered a "holy grail" in some research communities.
With that said, I'm extremely doubtful that Beehive is able to meet those expectations. Machine learning can do quite well at extracting complex patterns from large piles of data with reliable identities. For example, differentiating between pictures of cats and dogs is extremely reliable; we can all do it 99+% of the time, yet if I had to tell what's the exact algorithm for taking in 100x100 pixels and determining cat vs dog, I have no idea how I would do that. But I can supply you with 100,000 of such images and let ML methods figure out a rule that reliably differentiates between the two based on the values of 100x100 pixels. If I do things right, the rules created by ML should even work on new images of cats and dogs, assuming no huge changes in the new data (i.e., if I only used labs and tabby cats in the training data, then try to get it to identify a terrier...good luck). That's ML's strength.
Determining "suspicious behavior" is a much more difficult issue. We don't have 100,000's of samples of confirmed bad behavior, and we don't even really have 100,000's of samples of confirmed good behavior! Worse yet, what was a good ML method that worked yesterday doesn't work today; unlike cats and dogs in photos, adversaries try really hard to trick you. Most people I know working on ML for cyber security have accepted that the idea of purely automated detection is beyond our grasp at the moment, but perhaps we can build tools to automate very specific repetitive tasks that a security analyst needs to do over and over, thus making them more efficient.
With that said, the authors of Beehive seem to have skipped that lesson and claim that they've solved this problem. I'm highly suspicious of the performance, especially given that the methods they suggest are the first one a ML researcher might think to try and have routinely been rejected as not useful. For example, they suggest using PCA to identify outliers in logs. This, and variations of it, has been tried 100s of times and the result is always that the security analyst shuts off the "automated detection" because they get so many false positives that it costs way more time than it saves.
Of course, in all these methods, the devil is the details and the details of these types of methods never really get exposed in published work ("we used PCA to look for outliers in server logs" is an extremely vague statement). It's always possible that they have some super clever way of preprocessing the data before applying their methods that didn't make it into the paper. But I'd be willing to bet my right arm that no user of Beehive will be able to reliably differentiate between "Snowden-like" behavior and normal real world use of a network in real time.
New contributor
add a comment |
up vote
1
down vote
up vote
1
down vote
First of all, there is a very important distinction between being able to detect a "Snowden-like" actor and being able to prevent one. As far as I have seen, Beehive makes no claims about preventing one, but rather seems to promise the ability to give you alerts that suspicious activity is happening in your network. Sure, not as good, but still considered a "holy grail" in some research communities.
With that said, I'm extremely doubtful that Beehive is able to meet those expectations. Machine learning can do quite well at extracting complex patterns from large piles of data with reliable identities. For example, differentiating between pictures of cats and dogs is extremely reliable; we can all do it 99+% of the time, yet if I had to tell what's the exact algorithm for taking in 100x100 pixels and determining cat vs dog, I have no idea how I would do that. But I can supply you with 100,000 of such images and let ML methods figure out a rule that reliably differentiates between the two based on the values of 100x100 pixels. If I do things right, the rules created by ML should even work on new images of cats and dogs, assuming no huge changes in the new data (i.e., if I only used labs and tabby cats in the training data, then try to get it to identify a terrier...good luck). That's ML's strength.
Determining "suspicious behavior" is a much more difficult issue. We don't have 100,000's of samples of confirmed bad behavior, and we don't even really have 100,000's of samples of confirmed good behavior! Worse yet, what was a good ML method that worked yesterday doesn't work today; unlike cats and dogs in photos, adversaries try really hard to trick you. Most people I know working on ML for cyber security have accepted that the idea of purely automated detection is beyond our grasp at the moment, but perhaps we can build tools to automate very specific repetitive tasks that a security analyst needs to do over and over, thus making them more efficient.
With that said, the authors of Beehive seem to have skipped that lesson and claim that they've solved this problem. I'm highly suspicious of the performance, especially given that the methods they suggest are the first one a ML researcher might think to try and have routinely been rejected as not useful. For example, they suggest using PCA to identify outliers in logs. This, and variations of it, has been tried 100s of times and the result is always that the security analyst shuts off the "automated detection" because they get so many false positives that it costs way more time than it saves.
Of course, in all these methods, the devil is the details and the details of these types of methods never really get exposed in published work ("we used PCA to look for outliers in server logs" is an extremely vague statement). It's always possible that they have some super clever way of preprocessing the data before applying their methods that didn't make it into the paper. But I'd be willing to bet my right arm that no user of Beehive will be able to reliably differentiate between "Snowden-like" behavior and normal real world use of a network in real time.
New contributor
First of all, there is a very important distinction between being able to detect a "Snowden-like" actor and being able to prevent one. As far as I have seen, Beehive makes no claims about preventing one, but rather seems to promise the ability to give you alerts that suspicious activity is happening in your network. Sure, not as good, but still considered a "holy grail" in some research communities.
With that said, I'm extremely doubtful that Beehive is able to meet those expectations. Machine learning can do quite well at extracting complex patterns from large piles of data with reliable identities. For example, differentiating between pictures of cats and dogs is extremely reliable; we can all do it 99+% of the time, yet if I had to tell what's the exact algorithm for taking in 100x100 pixels and determining cat vs dog, I have no idea how I would do that. But I can supply you with 100,000 of such images and let ML methods figure out a rule that reliably differentiates between the two based on the values of 100x100 pixels. If I do things right, the rules created by ML should even work on new images of cats and dogs, assuming no huge changes in the new data (i.e., if I only used labs and tabby cats in the training data, then try to get it to identify a terrier...good luck). That's ML's strength.
Determining "suspicious behavior" is a much more difficult issue. We don't have 100,000's of samples of confirmed bad behavior, and we don't even really have 100,000's of samples of confirmed good behavior! Worse yet, what was a good ML method that worked yesterday doesn't work today; unlike cats and dogs in photos, adversaries try really hard to trick you. Most people I know working on ML for cyber security have accepted that the idea of purely automated detection is beyond our grasp at the moment, but perhaps we can build tools to automate very specific repetitive tasks that a security analyst needs to do over and over, thus making them more efficient.
With that said, the authors of Beehive seem to have skipped that lesson and claim that they've solved this problem. I'm highly suspicious of the performance, especially given that the methods they suggest are the first one a ML researcher might think to try and have routinely been rejected as not useful. For example, they suggest using PCA to identify outliers in logs. This, and variations of it, has been tried 100s of times and the result is always that the security analyst shuts off the "automated detection" because they get so many false positives that it costs way more time than it saves.
Of course, in all these methods, the devil is the details and the details of these types of methods never really get exposed in published work ("we used PCA to look for outliers in server logs" is an extremely vague statement). It's always possible that they have some super clever way of preprocessing the data before applying their methods that didn't make it into the paper. But I'd be willing to bet my right arm that no user of Beehive will be able to reliably differentiate between "Snowden-like" behavior and normal real world use of a network in real time.
New contributor
edited 8 mins ago
New contributor
answered 13 mins ago
Cliff AB
1113
1113
New contributor
New contributor
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsecurity.stackexchange.com%2fquestions%2f197169%2fcan-beehive-detect-a-snowden-like-actor%23new-answer', 'question_page');
}
);
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
36
Simple answer: No, most certainly not. Snowden was someone who had privileged access and had the authority and reason to mass-download content (he was a sysadmin).
– forest
2 days ago
3
But in the training case, they model everybody according to their behavior. So, after the training, a mass download will be a behavioral change that will produce an alert signal.
– kelalaka
2 days ago
6
Unless mass-downloading is 1) not common and 2) it's not possible to just throttle the download.
– forest
2 days ago
13
Why "mass download" is even considered suspicious. there are will be some sorts of constant "mass" downloads during everyday usage, was my first thought. What is mass download? 1 MB? 500 MB ? 5 GB? 500 GB? ...
– Croll
2 days ago
8
@Croll If your organisation has one million files, any one person probably doesn't need to access anywhere close to that many in order to do their job (most files won't be related to their work). If somebody starts trying to download all one million over a day or two, that's suspicious. Even a small percentage of that one million could be suspicious. 1% of one million is 10,000 files. How many people working for your organisation need to access 10,000 files over the span of 48 hours to do their job? Very few (if any).
– Anthony Grist
2 days ago