General Functional Discussions for Service

Topic

    Pavol Procka
    Barracuda - what is SPAM USER value based on? (Answered)
    Topic posted August 14, 2017 by Pavol Procka, last edited August 14, 2017

    Hi,

    Our Barracuda SPAM filter has recently been adding the #TAG# to a lot of incoming emails.

    I understand this is based on the Tag value in the Barracuda app, but it seems that a lot of legitimate emails are tagged because they are classified as SPAM USER with values of 4+ (our TAG Score is set to 3.5).

    Can anyone tell me where this value comes from, and how it can be adjusted? The users I checked (including the one from the screenshot) are not on our list of blacklisted email addresses in the Barracuda app, so I assume they are getting these values based on a blacklist managed by Barracuda?

    (Even so, we have now reset the Bayesian Database as a test, keeping a backup, but we are not really happy to get rid of the 1000s of entries we taught it over the years, and I am also afraid it will start flooding our normal queues with SPAM.) I would still like to know where the value for SPAM USER comes from.

    Attached screenshot of the header showing the SPAM USER

    Best Comment

    Steven House

    Hello Pavol,

    Before I get to your specific question, I think it’s good to quickly cover what Bayesian Analysis is and what ‘The User’ section means. 

    Bayesian Analysis is a linguistic algorithm that profiles the language used in both spam messages and legitimate email for a particular user or organization. To determine the likelihood that a new email is spam, Bayesian Analysis compares the words and phrases used in the new email against the corpus of previously identified email. The Barracuda Spam Firewall only uses Bayesian Analysis after administrators or users have profiled a corpus of at least 200 legitimate messages and 200 spam messages.
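
    To make the idea concrete, here is a minimal sketch of a word-frequency Bayesian classifier in Python. It is only an illustration of the general technique, not Barracuda's actual implementation: the class name, the tokenization by whitespace, and the smoothing are all assumptions. The only detail taken from the description above is the minimum of 200 messages per category before the classifier is used.

```python
import math
from collections import Counter

MIN_MESSAGES_PER_CLASS = 200  # minimum corpus size quoted in the reply above


class BayesSketch:
    """Toy naive Bayes spam classifier (hypothetical, not Barracuda's code)."""

    def __init__(self, min_per_class=MIN_MESSAGES_PER_CLASS):
        self.min_per_class = min_per_class
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message that was classified as 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def is_active(self):
        """The classifier is only used once both categories are large enough."""
        return all(n >= self.min_per_class for n in self.message_counts.values())

    def spam_probability(self, text):
        """Return P(spam | text), or None while the corpus is too small."""
        if not self.is_active():
            return None
        spam_words, ham_words = self.word_counts["spam"], self.word_counts["ham"]
        vocab_size = len(set(spam_words) | set(ham_words))
        spam_total = sum(spam_words.values())
        ham_total = sum(ham_words.values())
        # Start from the prior odds, then add one log-likelihood ratio per word.
        log_odds = math.log(self.message_counts["spam"] / self.message_counts["ham"])
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words never give zero.
            p_spam = (spam_words[word] + 1) / (spam_total + vocab_size)
            p_ham = (ham_words[word] + 1) / (ham_total + vocab_size)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp()
        return 1.0 / (1.0 + math.exp(-log_odds))
```

    The point of the sketch is that the result depends entirely on what was previously marked as spam or not spam in that particular database, so the same message can score differently on two systems.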

    And to your point, the ‘User’ section of the X-Barracuda-Bayes line indicates that the code is evaluating the message against the per-user Bayesian database.
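
    If you want to pull these fields out of many messages at once, a small parser along the following lines may help. This is a sketch under assumptions: the exact layout of the X-Barracuda-Bayes line is not shown in this thread, so the pattern below assumes a verdict word, a scope word (such as USER), and then one or more numbers. The sample line in the usage note is made up for illustration, and the function name is hypothetical.

```python
import re

# ASSUMPTION: the header looks like
#   X-Barracuda-Bayes: <VERDICT> <SCOPE> <number> [<number> ...]
# Check this against your own message headers before relying on it.
BAYES_LINE = re.compile(
    r"^X-Barracuda-Bayes:\s*(?P<verdict>[A-Z]+)\s+(?P<scope>[A-Z]+)\s+(?P<values>.+)$"
)


def parse_bayes_header(line):
    """Split an X-Barracuda-Bayes line into verdict, scope and numeric values.

    Returns None if the line does not match the assumed layout.
    """
    match = BAYES_LINE.match(line.strip())
    if match is None:
        return None
    try:
        values = [float(v) for v in match.group("values").split()]
    except ValueError:
        return None  # trailing fields were not all numeric
    return {
        "verdict": match.group("verdict"),
        "scope": match.group("scope"),
        "values": values,
    }
```

    For example, parse_bayes_header("X-Barracuda-Bayes: SPAM USER 0.9990 1.0000 4.2000") would return the verdict "SPAM", the scope "USER", and the three numbers, without interpreting what each number means.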

    Your Issue: This value, as noted before, comes from the content of the message and doesn’t involve any known blacklists (Note: if it were a blacklist, that information would be included in the ‘Barracuda-Spam-Report’ breakdown). It’s safe to assume that you have received similar messages in the past that were marked as ‘Spam’ in your Quarantine Inbox. Still, no system is perfect, and I believe this is a good opportunity to review your ‘Tag’ settings. My personal preference, if you trust your team’s ability to assess and identify spam, is to increase the ‘Tag’ value (possibly to 5?). If you’re not comfortable with that, and this issue is causing your team a lot of problems, then it may be best to temporarily disable the tag feature in your account and decrease your Quarantine score. This way, when a message is caught by your Quarantine Inbox, you can mark similar messages as ‘not spam’ to train your Bayesian database to avoid these false positives.
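
    The two options above can be sketched as a simple threshold check. This is an illustration only: the function name and the "at or above the level" comparison are assumptions, not Barracuda's documented behavior. The default of 3.5 is the TAG Score mentioned in the question.

```python
def action_for_score(score, tag_level=3.5, quarantine_level=None):
    """Decide what happens to a message with the given total spam score.

    A level of None means that feature is disabled. The higher-impact
    action (quarantine) is checked first.
    ASSUMPTION: an action triggers when the score is at or above its level.
    """
    if quarantine_level is not None and score >= quarantine_level:
        return "quarantine"
    if tag_level is not None and score >= tag_level:
        return "tag"
    return "deliver"


# Current setup: a message scoring 4.2 gets tagged.
print(action_for_score(4.2))
# Option 1: raise the Tag value to 5, so the same message is delivered untagged.
print(action_for_score(4.2, tag_level=5))
# Option 2: disable tagging and lower the Quarantine score (4.0 is an example
# value), so the message lands in the Quarantine Inbox where it can be
# marked as not spam.
print(action_for_score(4.2, tag_level=None, quarantine_level=4.0))
```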

    What we’ll be unable to do, unfortunately, is specifically identify the source of this issue. That’s because this score is completely unique to your system, and we don’t have access to your past decisions. 

    If you still have questions about your Barracuda instance, please feel free to submit a support request and our team will be happy to take a look.

    I hope this update helps and I look forward to any feedback/concerns you have with this update.

    Best,

    Steven House | Senior Deliverability Specialist
    Oracle Cloud Operations

    Comment

     

    • Pavol Procka

      Hi Steven,

       

      Thanks for the explanation; now I know where that score comes from. Based on that, we have decided to give it a go and reset our Bayesian Database (as is suggested anyway in a few KB articles I found) and will try to teach it the "good practices" only. (It already had close to 90,000 messages in the SPAM category, so I can imagine that a lot of text would be tagged based on these.)

      We are keeping a backup in case the reset does more harm than good. I understand I will need to add 200 messages of each type before it starts kicking in again. I will keep an eye on how it develops.

      We did change the TAG value to 5 previously, but only as a temporary measure to help the support team manage the tagged emails. We prefer to keep the TAG score at 3.5, which is what it is set to again now.

      Many thanks

      Pavol

    • Pavol Procka

      Just to update here: we are still struggling with tuning the spam filtering.

      Also, the information about 200 messages being needed in both the SPAM and NON-SPAM categories does not seem to be correct. After we reset the Bayesian database, emails were still being scored according to the X-Barracuda-Bayes line, even before we had repopulated the lists to 200+. This was a daily occurrence for some 3 weeks (the time we needed to repopulate the NON-SPAM category with 200 examples).