zmtrainsa --cleanup not fully cleaning up

Post by ewilen »

I've noticed that some messages are left in the spam account for extended periods. For example, I just ran a test with:

/opt/zimbra/bin/zmtrainsa >> /opt/zimbra/log/spamtrain.log 2>&1

/opt/zimbra/bin/zmtrainsa --cleanup >> /opt/zimbra/log/spamtrain.log 2>&1
Then I looked in spamtrain.log and saw:


[QUOTE]20090730122011 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 42
[] INFO: Total messages processed: 4
20090730122017 Finished extracting spam/ham from system accounts.
20090730122017 Starting spamassassin training.
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 22 message(s) (42 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 4 message(s) (4 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
bayes: synced databases from journal in 1 seconds: 2982 unique entries (2982 total entries)
20090730122025 Finished spamassassin training.
20090730122036 Starting spam/ham cleanup
[] INFO: Total messages processed: 30
[] INFO: Total messages processed: 4
20090730122042 Finished spam/ham cleanup[/QUOTE]

As you can see, 12 fewer messages were processed by cleanup than by extraction. I confirmed this by looking in the spam account's inbox. Then I ran the two commands again and looked at the log:

[QUOTE]

20090730122241 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 12
[] INFO: Total messages processed: 0
20090730122245 Finished extracting spam/ham from system accounts.
20090730122245 Starting spamassassin training.
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (12 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (0 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
bayes: synced databases from journal in 0 seconds: 340 unique entries (539 total entries)
20090730122249 Finished spamassassin training.
20090730122254 Starting spam/ham cleanup
[] INFO: Total messages processed: 12
[] INFO: Total messages processed: 0
20090730122259 Finished spam/ham cleanup
[/QUOTE]

So the final 12 messages were extracted and processed again (potentially overweighting their contents?) and only then cleaned up.
Any idea why this is happening? My only clue is that the cleanup might be skipping messages marked as "read" (because I had been browsing the spam account), but I don't think it actually follows that pattern.
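
For anyone who wants to check their own spamtrain.log for the same gap, here is a minimal Python sketch that pairs each extraction pass with the cleanup pass that follows it and reports the difference. It assumes the log lines look exactly like the ones quoted above; the function names `phase_counts` and `leftovers` are made up for this example, and treating the first "Total messages processed" line as the spam account and the second as the ham account is an assumption based on the order of the output.

```python
import re

COUNT_RE = re.compile(r"Total messages processed: (\d+)")

# Excerpt of the first run quoted above.
SAMPLE = """\
20090730122011 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 42
[] INFO: Total messages processed: 4
20090730122017 Finished extracting spam/ham from system accounts.
20090730122017 Starting spamassassin training.
Learned tokens from 22 message(s) (42 message(s) examined)
Learned tokens from 4 message(s) (4 message(s) examined)
20090730122025 Finished spamassassin training.
20090730122036 Starting spam/ham cleanup
[] INFO: Total messages processed: 30
[] INFO: Total messages processed: 4
20090730122042 Finished spam/ham cleanup
"""


def phase_counts(log_text):
    """Return (phase, [counts]) pairs in the order the phases appear."""
    phases = []
    current = None
    for line in log_text.splitlines():
        if "Starting spam/ham extraction" in line:
            current = ("extraction", [])
            phases.append(current)
        elif "Starting spam/ham cleanup" in line:
            current = ("cleanup", [])
            phases.append(current)
        elif current is not None:
            match = COUNT_RE.search(line)
            if match:
                current[1].append(int(match.group(1)))
    return phases


def leftovers(log_text):
    """For each extraction/cleanup pair, list extracted minus cleaned per account."""
    result = []
    pending = None
    for phase, counts in phase_counts(log_text):
        if phase == "extraction":
            pending = counts
        elif pending is not None:
            result.append([e - c for e, c in zip(pending, counts)])
            pending = None
    return result


print(leftovers(SAMPLE))  # [[12, 0]]
```

A non-zero entry means cleanup handled fewer messages than extraction did in that pass, which is the 12-message gap described above.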

Post by dgsjeh »

I know this thread is nearly 4 years old, but I have the same problem in my spamtrain.log:
[QUOTE]Learned tokens from 292 message(s) (673 message(s) examined)
Learned tokens from 0 message(s) (0 message(s) examined)
20120522230055 Finished spamassassin training.
20120522234502 Starting spam/ham cleanup
[] INFO: Total messages processed: 354
[] INFO: Total messages processed: 0
20120522234510 Finished spam/ham cleanup[/QUOTE]
Why did it delete only 354 of 673 messages? Does anyone have more information about zmtrainsa --cleanup? I'm very curious why it doesn't delete all messages from the spam account.

Post by ewilen »

I don't think I've had the problem for a long time. Partly in response to my observation, I started being careful to "mu" (mark unread) any message I had read in the spam/ham accounts. Ultimately I just set the option in both accounts to not mark messages as read when displayed in the preview pane. I've also turned off "Double-click opens message in new window".

Post by dgsjeh »

I've double-checked that all the messages were marked as "Unread", so I don't believe that was the issue.
I manually ran zmtrainsa and zmtrainsa --cleanup over and over until all the messages were gone from the account. It's almost as if the cleanup can only process so many messages at once.
[QUOTE]20120523133152 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 446
[] INFO: Total messages processed: 1
20120523133202 Finished extracting spam/ham from system accounts.
20120523133202 Starting spamassassin training.
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 114 message(s) (440 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 1 message(s) (1 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
20120523133224 Finished spamassassin training.
20120523133235 Starting spam/ham cleanup
[] INFO: Total messages processed: 236
[] INFO: Total messages processed: 1
20120523133244 Finished spam/ham cleanup
20120523133320 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 210
[] INFO: Total messages processed: 0
20120523133327 Finished extracting spam/ham from system accounts.
20120523133327 Starting spamassassin training.
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (209 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (0 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
bayes: synced databases from journal in 0 seconds: 322 unique entries (381 total entries)
20120523133334 Finished spamassassin training.
20120523133351 Starting spam/ham cleanup
[] INFO: Total messages processed: 120
[] INFO: Total messages processed: 0
20120523133358 Finished spam/ham cleanup
20120523133423 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 90
[] INFO: Total messages processed: 0
20120523133430 Finished extracting spam/ham from system accounts.
20120523133430 Starting spamassassin training.
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (89 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (0 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
bayes: synced databases from journal in 0 seconds: 348 unique entries (350 total entries)
20120523133435 Finished spamassassin training.
20120523133442 Starting spam/ham cleanup
[] INFO: Total messages processed: 60
[] INFO: Total messages processed: 0
20120523133449 Finished spam/ham cleanup
20120523133500 Starting spam/ham extraction from system accounts.
[] INFO: Total messages processed: 30
[] INFO: Total messages processed: 0
20120523133507 Finished extracting spam/ham from system accounts.
20120523133507 Starting spamassassin training.
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (29 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
Learned tokens from 0 message(s) (0 message(s) examined)
netset: cannot include 127.0.0.0/8 as it has already been included
bayes: synced databases from journal in 0 seconds: 177 unique entries (209 total entries)
20120523133511 Finished spamassassin training.
20120523133535 Starting spam/ham cleanup
[] INFO: Total messages processed: 30
[] INFO: Total messages processed: 0
20120523133541 Finished spam/ham cleanup[/QUOTE]
A bit of a pain as we just implemented this feature and are regularly seeing 700+ messages in the spam account per day.
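
The manual "run both commands until the account is empty" routine could be scripted. Below is a rough Python sketch of that loop, not a tested fix. It assumes zmtrainsa prints its "Total messages processed" lines to stdout or stderr (they end up in spamtrain.log through the `2>&1` redirect shown earlier); the helper names `run_command`, `processed` and `drain` are invented for this example. In the logs above the repeat passes report "Learned tokens from 0 message(s)", so re-running the training step did not appear to re-learn anything.

```python
import re
import subprocess

ZMTRAINSA = "/opt/zimbra/bin/zmtrainsa"  # path as used in the commands in this thread
COUNT_RE = re.compile(r"Total messages processed: (\d+)")


def run_command(args):
    """Run a command and return its combined stdout/stderr as text."""
    done = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return done.stdout


def processed(output):
    """Sum every 'Total messages processed' count found in the output."""
    return sum(int(n) for n in COUNT_RE.findall(output))


def drain(run=run_command, max_rounds=10):
    """Repeat training + cleanup until extraction finds no messages.

    Returns the number of train/cleanup rounds that were needed, or
    max_rounds if the accounts were still not empty by then.
    """
    for round_no in range(1, max_rounds + 1):
        extracted = processed(run([ZMTRAINSA]))
        if extracted == 0:
            return round_no - 1
        run([ZMTRAINSA, "--cleanup"])
    return max_rounds
```

Calling `drain()` on the mail server would run the real commands; the `run` parameter exists only so the loop can be exercised with a stand-in function first. Note that this sketch does not append anything to spamtrain.log.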

Post by ewilen »

I agree. The multiples of 30 are interesting. I'd suggest opening a Bugzilla report and posting the bug number here.
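
One guess that happens to fit these numbers (nothing in this thread confirms it, and it is not taken from the zmtrainsa source): the cleanup fetches messages in pages of 30 and keeps advancing its offset after deleting each page, so the remaining messages shift down and every second page is skipped. The Python sketch below simulates only that hypothetical behavior and compares it with the spam-account counts logged above; `simulate_cleanup` and the page size of 30 are assumptions for illustration.

```python
PAGE = 30  # assumed batch size, suggested only by the multiples of 30 in the logs


def simulate_cleanup(total, page=PAGE):
    """Delete one page at a time while still advancing the offset.

    Returns how many messages end up deleted. Deleting a page shifts the
    remaining messages down, so moving the offset forward skips a page.
    """
    messages = list(range(total))
    offset = 0
    deleted = 0
    while offset < len(messages):
        batch = messages[offset:offset + page]
        del messages[offset:offset + page]
        deleted += len(batch)
        offset += page  # the hypothetical bug: the offset should not move after a delete
    return deleted


# (extracted, cleaned up) spam-account counts copied from the logs in this thread
observed = [(42, 30), (12, 12), (446, 236), (210, 120), (90, 60), (30, 30)]

for extracted, cleaned in observed:
    print(extracted, cleaned, simulate_cleanup(extracted))
```

For each of these pairs the simulated count equals the logged cleanup count, which would also explain why the leftovers (210, 90, 30) are multiples of 30. It is still only a pattern match against six data points, so it is worth mentioning in a bug report rather than treating as the confirmed cause.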

Post by dgsjeh »

If anyone is curious, I have "fixed" this issue by changing the cron schedule to run 4 times a day:
[QUOTE]0 5,11,17,23 * * * /opt/zimbra/bin/zmtrainsa >> /opt/zimbra/log/spamtrain.log 2>&1[/QUOTE]
I also modified the cleanup cron job to run 4 times a day and decreased the amount of time to wait after the spamtrain cron job runs (from 45 mins to 5 mins):
[QUOTE]5 5,11,17,23 * * * /opt/zimbra/bin/zmtrainsa --cleanup >> /opt/zimbra/log/spamtrain.log 2>&1[/QUOTE]