Spamsnake Baruwa - Fuzzy-cleanmysql

Discussion in 'HOWTO-Related Questions' started by itsnedkeren, Feb 27, 2011.

  1. itsnedkeren

    itsnedkeren New Member

    I believe Rocky posted in another thread that he would look at this script, as it never completes and uses 100% CPU.

    Any news on this, Rocky?

    Code:
    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    6135 root      20   0  7584 3668 1728 R 99.9  0.2 674:10.94 fuzzy-cleanmysq

    Thanks
     
    Last edited: Feb 27, 2011
  2. Rocky

    Rocky Member

    Jim,

    Please enable verbose in the script by setting:
    my $verbose = 0;
    to
    my $verbose = 1;
    Run the script and post your logs.

    I haven't been able to replicate this problem; everything works on my end. What version of MySQL are you running?
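
    As a side note, verbose mode does not necessarily require editing the file. The sketch below assumes the script parses a --verbose switch with Getopt::Long, as the version posted later in this thread does. The helper name parse_verbose is hypothetical and only there to make the behaviour easy to check.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a list of arguments and report whether
# --verbose was requested. The option name matches the script in this thread.
sub parse_verbose {
    my @args    = @_;
    my $verbose = 0;    # same default as the script
    GetOptionsFromArray(\@args, 'verbose' => \$verbose)
        or die "Bad command-line options\n";
    return $verbose;
}

$| = 1;    # flush STDOUT immediately so progress lines are not held in a buffer
my $verbose = parse_verbose(@ARGV);
print "Verbose output is ", ($verbose ? "on" : "off"), "\n";
```

    Under that assumption, running the script as "fuzzy-cleanmysql --verbose" should have the same effect as setting $verbose to 1 in the file.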
     
  3. nousa

    nousa New Member

    Hi Rocky, I have the same issue as itsnedkeren: fuzzy-cleanmysql just doesn't end and takes 100% of the CPU.
    My MySQL version is 5.1.49-1ubuntu8.1.
    When I change $verbose to 1, nothing changes: the script behaves the same as with 0 (it keeps running). I also cannot find any fuzzy-cleanmysql error log anywhere. Where should I look for the verbose logs? Nothing is shown on the screen.
    Cheers,
    nousa
     
    Last edited: Mar 1, 2011
  4. itsnedkeren

    itsnedkeren New Member

    I've done as above and nothing changed. It still outputs nothing and runs forever using 100% CPU.
     
  5. Rocky

    Rocky Member

    Maybe it was a script error caused by Windows. Copy the code below into a new fuzzy-cleanmysql script, change the username/password to match your system, and make it executable. Run it and let me know if it still fails.

    Code:
    #!/usr/bin/perl
    # Script to clean old data out of the FuzzyOcr MySQL tables.
    # By default, rows in both Hash and Safe are kept for 10 days (override with --age, --hash, --safe).
    #Fuzzyocr-cleanmysql
    
    use Getopt::Long;
    use DBI;
    use MLDBM qw(DB_File Storable);
    my %Files = (
        db_hash => '/var/lib/fuzzyocr/FuzzyOcr.db',
        db_safe => '/var/lib/fuzzyocr/FuzzyOcr.safe.db',
        );
    
    # MySQL connection settings: change these to match your system
    $database = "FuzzyOcr";
    $hostname = "localhost";
    $socket = "/var/run/mysqld/mysqld.sock";
    $port = "3306";
    $username = "fuzzyocr";
    $password = 'password';
    
    # defaults
    my $cfgfile = "/etc/spamassassin/FuzzyOcr.cf";
    my %App;
    
    my %age;
    $age{'age'} = 10*24;  # 10 days
    $age{'hash'} = $age{'age'};
    $age{'safe'} = 0;
    my $help = 0;
    my $verbose = 0;
    GetOptions( \%age,
        'age=i',
        'config=s' => \$cfgfile,
        'hash=i',
        'help' => \$help,
        'safe=i',
        'verbose' => \$verbose,
    );
    
    if ($help) {
        print "Usage: fuzzy-cleanmysql [Options]\n";
        print "\n";
        print "Available options:\n";
        print "--age=i      Global age in hours to keep in db\n";
        print "--config=s   Specify location of FuzzyOcr.cf\n";
        print "             Default: /etc/spamassassin/FuzzyOcr.cf\n";
        print "--hash=i     Number of hours old to keep in Hash db\n";
        print "--safe=i     Number of hours old to keep in Safe db\n";
        print "--verbose    Show more information\n";
        print "\n";
        exit 1;
    }
    
    # Convert hours to seconds
    $age{'age'} *= 60 * 60;
    $age{'hash'} *= 60 * 60;
    $age{'safe'} *= 60 * 60;
    $age{'safe'} = $age{'safe'} ? $age{'safe'} : $age{'age'};
    
    # Read custom paths from FuzzyOcr.cf
    my $app_path = q(/usr/local/netpbm/bin:/usr/local/bin:/usr/bin);
    open CONFIG, "< $cfgfile" or warn "Can't read configuration file, using defaults...\n";
    
    while (<CONFIG>) {
        chomp;
        if ($_ =~ m/^focr_bin_(\w+) (.+)/) {
            $App{$1} = $2;
            printf "Found custom path \"$2\" for application \"$1\"\n" if $verbose;
        }
        if ($_ =~ m/^focr_path_bin (.+)/) {
            $app_path = $1;
            printf "Found new path: \"$1\"\n" if $verbose;
        }
        if ($_ =~ m/^focr_enable_image_hashing (\d)/) {
            $App{hashing_type} = $1;
            printf "Found DB Hashing\n" if ($verbose and $1 == 2);
            printf "Found MySQL Hashing\n" if ($verbose and $1 == 3);
        }
        if ($_ =~ m/^focr_mysql_(\w+) (.+)/) {
            $MySQL{$1} = $2;
            printf "Found MySQL option $1 => '$2'\n" if $verbose;
        }
        if ($_ =~ m/^focr_threshold_max_hash (.+)/) {
            $App{max_hash} = $1;
            printf "Updated Threshold{max_hash} = $1\n" if $verbose;
        }
    }
    
    close CONFIG;
    
    # make sure we have this threshold set
    $App{max_hash} = 5 unless defined $App{max_hash};
    
    # search path for bin_util unless already specified in configuration file
    foreach my $app (@bin_utils) {
        next if defined $App{$app};
        foreach my $d (split(':',$app_path)) {
            if (-x "$d/$app") {
                $App{$app} = "$d/$app";
                last;
            }
        }
    }
    
    sub get_ddb {
        my %dopts = ( AutoCommit => 1 );
        my $dsn = "DBI:mysql:database=$database";
        if (defined $socket) {
            $dsn .= ";mysql_socket=$socket";
        } else {
            $dsn .= ";host=$hostname";
            $dsn .= ";port=$port" unless $port == 3306;
        }
        printf "Connecting to: $dsn\n" if $verbose;
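        # "return ... or die" never reaches the die, so connect first and check the handle
        my $dbh = DBI->connect($dsn, $username, $password, \%dopts)
            or die("Could not connect: $DBI::errstr\n");
        return $dbh;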
    }
    
    if ($App{hashing_type} == 3) {
     my $ddb = get_ddb();
      if ($ddb) {
        my $sql;
        foreach my $ff (sort keys %Files) {
          $ff =~ s/db_//;
          $sqlbase = "FROM $MySQL{$ff} WHERE $MySQL{$ff}.\`check\` < ?";
          my $timestamp = time;
          $timestamp = $timestamp - $age{$ff};
          $sql = "DELETE $sqlbase";
          if ( $verbose ) {
            printf "Delete from Table $MySQL{$ff}\n";
            print "$sql,  $timestamp\n";
            print "Timestamp is ", scalar(localtime($timestamp)), "\n";
            print "That's $age{$ff} seconds earlier than now.\n";
            print "\n";
          }
          $ddb->do($sql,undef,$timestamp);
        }
        $ddb->disconnect;
      }
    }
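
    To make the retention arithmetic in the script easier to follow, here is a minimal standalone sketch of how the cutoff timestamps are derived. It needs no database. The helper name cutoffs is hypothetical, and the fixed "now" value in the usage note is chosen only to reproduce the numbers in the verbose output posted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn retention periods given in hours into the
# Unix-timestamp cutoffs that the DELETE statements compare against.
# As in the script, a safe value of 0 falls back to the global age.
sub cutoffs {
    my ($now, %hours) = @_;
    my %seconds = map { $_ => $hours{$_} * 60 * 60 } qw(age hash safe);
    $seconds{safe} ||= $seconds{age};
    return (
        hash => $now - $seconds{hash},
        safe => $now - $seconds{safe},
    );
}

# Script defaults: age and hash are 10 days, safe is 0 (use the global age).
my %cutoff = cutoffs(time, age => 10 * 24, hash => 10 * 24, safe => 0);
for my $table (sort keys %cutoff) {
    printf "%s: delete rows with check < %d (%s)\n",
        $table, $cutoff{$table}, scalar localtime($cutoff{$table});
}
```

    With the defaults, both tables get a cutoff 864000 seconds (10 days) in the past. Passing something like --safe=24 to the real script would move only the Safe cutoff to one day.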
    
     
  6. nousa

    nousa New Member

    Rocky! You rock!!!!
    The script works perfectly now and it takes maybe half a second to finish.
    However, I will keep an eye on it. If I have not posted here again within a week, that means your new script fixed the issue and you can update it in your main how-to.
    Thanks,
    nousa
     
  7. Rocky

    Rocky Member

    Great news. I must have copied it from my Word doc, which messed up the script. Glad to know it's working now.
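
    For anyone who wants to check an existing copy for this kind of damage, here is a rough sketch. The thread does not say what exactly was corrupted, so this assumes the usual suspects when a script passes through Windows or a word processor: carriage returns and non-ASCII characters such as smart quotes. The helper name find_suspect_lines is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a list of "line N: problem" strings for text
# that contains carriage returns or bytes outside plain ASCII.
sub find_suspect_lines {
    my ($text) = @_;
    my @problems;
    my $n = 0;
    for my $line (split /\n/, $text, -1) {
        $n++;
        push @problems, "line $n: carriage return (Windows line ending)"
            if $line =~ /\r/;
        push @problems, "line $n: non-ASCII byte (possibly a smart quote or dash)"
            if $line =~ /[^\x00-\x7F]/;
    }
    return @problems;
}

# Check the file named on the command line, if one was given.
if (@ARGV) {
    my $path = shift @ARGV;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    local $/;    # slurp the whole file in one read
    my $text = <$fh>;
    close $fh;
    print "$_\n" for find_suspect_lines($text);
}
```

    Saved under a name of your choice, it takes the script path as its only argument, for example /usr/sbin/fuzzy-cleanmysql. No output means neither kind of character was found, which does not rule out other sorts of corruption.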
     
  8. itsnedkeren

    itsnedkeren New Member

    Excellent Rocky, all working now :) Thanks a lot!


    Code:
    root@mailgw:~# /usr/sbin/fuzzy-cleanmysql
    Found custom path "pnmnorm, pnminvert,  ppmtopgm" for application "helper"
    Found custom path "tesseract" for application "helper"
    Found MySQL Hashing
    Found MySQL option db => 'FuzzyOcr'
    Found MySQL option hash => 'Hash'
    Found MySQL option safe => 'Safe'
    Found MySQL option user => 'fuzzyocr'
    Found MySQL option pass => '***************'
    Found MySQL option host => 'localhost'
    Found MySQL option port => '3306'
    Found MySQL option socket => '/var/run/mysqld/mysqld.sock'
    Connecting to: DBI:mysql:database=FuzzyOcr;mysql_socket=/var/run/mysqld/mysqld.sock
    Delete from Table Hash
    DELETE FROM Hash WHERE Hash.`check` < ?,  1298217534
    Timestamp is Sun Feb 20 16:58:54 2011
    That's 864000 seconds earlier than now.
    
    Delete from Table Safe
    DELETE FROM Safe WHERE Safe.`check` < ?,  1298217534
    Timestamp is Sun Feb 20 16:58:54 2011
    That's 864000 seconds earlier than now.
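
    The "Connecting to:" line in the output above shows the DSN the script builds. Below is a minimal sketch of that logic on its own, without opening a connection: use the Unix socket when one is set, otherwise use the host and add the port only when it is not 3306. The helper name build_dsn and the port 3307 are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a DBD::mysql DSN, preferring the Unix socket
# when one is given and falling back to host (plus a non-default port).
sub build_dsn {
    my (%opt) = @_;
    my $dsn = "DBI:mysql:database=$opt{database}";
    if (defined $opt{socket}) {
        $dsn .= ";mysql_socket=$opt{socket}";
    } else {
        $dsn .= ";host=$opt{host}";
        $dsn .= ";port=$opt{port}"
            if defined $opt{port} && $opt{port} != 3306;
    }
    return $dsn;
}

# Socket-based DSN, using the values from the output above.
print build_dsn(
    database => 'FuzzyOcr',
    socket   => '/var/run/mysqld/mysqld.sock',
), "\n";

# TCP-based DSN with an example non-default port.
print build_dsn(database => 'FuzzyOcr', host => 'localhost', port => 3307), "\n";
```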
    
     
  9. Rocky

    Rocky Member

    Great, guide updated as well.
     
