split mbox

Last updated: 2015-12-21

Contents

  1. Method 1: Bash Script
  2. Method 2: csplit
  3. Method 3: mb2md
  4. Method 4: formail

 


Method 1: Bash Script

 

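#!/bin/bash
# Split the mbox read from stdin into split.mbox.1, split.mbox.2, ...
# Run it with the mbox redirected to stdin ("< mbox").
i=0
BASENAME="split.mbox"
# IFS= and -r keep leading whitespace and backslashes intact;
# the "|| [[ -n $x ]]" part also handles a last line without a newline.
while IFS= read -r x || [[ -n $x ]]; do
    # A line starting with "From " begins the next message.
    [[ $x == "From "* ]] && i=$(( i + 1 ))
    printf '%s\n' "$x" >> "$BASENAME.$i"
done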
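
A shell read loop tends to be slow on large mailboxes. As a rough alternative, the same split can be sketched with awk. This is a swapped-in technique, not the script above: the function name split_mbox_awk and its arguments are made up for illustration, and it assumes body lines that begin with "From " are already escaped (">From ").

```shell
#!/bin/bash
# Sketch: split an mbox into BASE.1, BASE.2, ... with awk.
# Assumption: every message starts with a line beginning "From ".
split_mbox_awk() {
    local mbox=$1 base=${2:-split.mbox}
    awk -v base="$base" '
        BEGIN    { i = 0; out = base ".0" }   # lines before the first message go to BASE.0
        /^From / { close(out); i++; out = base "." i }
                 { print > out }
    ' "$mbox"
}
```

Usage: split_mbox_awk mbox split.mbox. Closing each file before opening the next keeps the number of open file descriptors constant, however many messages the mbox holds.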

 


Method 2: csplit

 

csplit

$ mkdir /tmp/test && cp ~/Mail/myCurruptInbox /tmp/test/mbox && cd /tmp/test
$ csplit -n 4 -k mbox '/^From /' '{*}'
$ cat xx* > new_mbox
$ cat xx0* xx1000 > new_mbox1

csplit cuts a file into pieces at the lines that match a pattern.

csplit  [OPTION] ... FILE  PATTERN

  • -k, --keep-files   <--   do not remove output files on errors
  • -n, --digits=DIGITS   <--   number of digits in the output file names (default: 2)
  • {*}    repeat the previous pattern as many times as possible
  • {n}    repeat the previous pattern n more times

Output: `xx00', `xx01', ...
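
The options above can be tried on a tiny mailbox before touching a real one. The following is a minimal sketch assuming GNU coreutils csplit; the sample messages and the temporary directory are invented for illustration. It also uses -z (--elide-empty-files) and -s (--quiet), which are not part of the commands shown earlier.

```shell
#!/bin/bash
# Sketch: split a two-message sample mbox with csplit and verify the round trip.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > mbox <<'EOF'
From alice@example.com Mon Dec 21 10:00:00 2015
Subject: first

Hello.

From bob@example.com Mon Dec 21 10:05:00 2015
Subject: second

Hi again.
EOF

# -z skips empty pieces (such as the one before the very first "From " line),
# -s suppresses the per-piece byte counts.
csplit -s -z -n 4 -k mbox '/^From /' '{*}'

# Count the pieces via the positional parameters.
set -- xx*
pieces=$#
echo "pieces: $pieces"

# Concatenating the pieces in name order should reproduce the original file.
cat xx* | cmp -s - mbox && echo "round trip OK"
```

Checking the round trip with cmp before deleting anything is a cheap safeguard when the source mailbox is suspected to be corrupt.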

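#!/bin/bash
# Append the first three files in the working directory to /var/mail/mail,
# deleting each file once it has been appended.

cd /var/mail/working_dir || exit 1

n=0

# Use a glob instead of parsing the output of ls.
for file in *
do
        [ -f "$file" ] || continue
        echo "$file"

        # Remove the piece only if the append succeeded.
        cat "$file" >> /var/mail/mail && rm "$file"

        n=$(( n + 1 ))

        if [ "$n" -gt 2 ]
        then
                break
        fi
done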

 


Method 3: mb2md

 

Download: wget http://www.dovecot.org/tools/mb2md.pl

chmod 700 mb2md.pl

Usage:

mb2md -h

mb2md -s sourcefile [-d destdir]

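  • sourcefile and destdir should both be full paths (mb2md resolves a relative path against the user's home directory, not the current directory)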
  • [-d destdir]   If not given, then the destination will be ~/Maildir
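
Because both arguments are expected as full paths, it can help to build the command line from absolute paths first. The sketch below only prints the command as a dry run. The helper names abs_path and mb2md_cmd are hypothetical, and it assumes the script sits in the current directory under its downloaded name mb2md.pl.

```shell
#!/bin/bash
# Hypothetical helper: make a path absolute without requiring it to exist.
abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "$PWD/$1" ;;
    esac
}

# Print (do not run) an mb2md command line with absolute -s and -d paths.
# The destination falls back to ~/Maildir, the default stated above.
mb2md_cmd() {
    local src dest
    src=$(abs_path "$1")
    dest=$(abs_path "${2:-$HOME/Maildir}")
    # %q quotes the paths so the printed line can be pasted into a shell.
    printf './mb2md.pl -s %q -d %q\n' "$src" "$dest"
}
```

Usage: running mb2md_cmd info from /var/mail/backup prints a command whose -s argument is /var/mail/backup/info.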

Example:

root@mailserver:/var/mail/backup# ./mb2md -s info

When there is a problem with the file:

Fatal: Source is not an mbox file or a directory!

 


Method 4: formail, the mail (re)formatter

 

Count the messages in an mbox:

grep -c "^From " <mbox>

69

Put every 100 messages into one file:

cat mbox_file | formail -100 -s > mbox.1
cat mbox_file | formail +100 -100 -s > mbox.2
cat mbox_file | formail +200 -100 -s > mbox.3
cat mbox_file | formail +300 -52 -s > mbox.4

formail [+skip] [-total] [-s [command [arg ...]]]

  • -s               split
  • +skip         Skip the first skip messages while splitting.
  • -total         Output at most total messages while splitting.
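
The +skip and -total values for each chunk are plain arithmetic and easy to get wrong by hand. As a minimal sketch, the hypothetical helper chunk_args below prints one "+skip -total" pair per chunk for a given message count and chunk size. It never runs formail. The first pair has a skip of 0, which corresponds to leaving +skip out.

```shell
#!/bin/bash
# Sketch: print the "+skip -total" pair for each chunk of an mbox.
# chunk_args is a made-up name; arguments are <message count> <chunk size>.
chunk_args() {
    local total=$1 size=$2 skip=0 n
    while [ "$skip" -lt "$total" ]; do
        # The last chunk holds whatever is left over.
        n=$(( total - skip ))
        if [ "$n" -gt "$size" ]; then
            n=$size
        fi
        echo "+$skip -$n"
        skip=$(( skip + size ))
    done
}

# 352 messages in chunks of 100 (illustrative numbers):
chunk_args 352 100
# +0 -100
# +100 -100
# +200 -100
# +300 -52
```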

 

Split an entire mbox, one message per file:

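#!/bin/bash

_mbox=office

cat "$_mbox" | formail -1 -s > "$_mbox.0"

for (( n=1; n<30; n++ ))
do
        cat "$_mbox" | formail +"$n" -1 -s > "$_mbox.$n"
done

# Only suitable for a small number of messages: every pass re-reads the
# whole mbox, and the loop bound (30) is hard-coded.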
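
For larger mailboxes, a single pass may be preferable. The following is a sketch only, assuming procmail's formail is installed. It relies on formail -s running a command once per message and exporting the FILENO counter to it, as described in the formail manual. The function name split_each and the variable MBOX_BASE are made up for illustration.

```shell
#!/bin/bash
# Sketch: write every message of an mbox to its own file in one pass.
split_each() {
    local mbox=$1
    # Each message arrives on the stdin of the sh command;
    # FILENO is set by formail, MBOX_BASE is passed through the environment.
    MBOX_BASE=$mbox formail -s sh -c 'cat > "$MBOX_BASE.$FILENO"' < "$mbox"
}
```

Usage: split_each office. The numbering and zero padding of the output names follow formail's FILENO defaults, so check the resulting file names before scripting against them.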

Create run.sh to split the mbox:

#!/bin/bash
# number of mail: 344
# 0
cat office | formail -30 -s > office.0
# 1
cat office | formail +30 -30 -s > office.1
# 2
cat office | formail +60 -30 -s > office.2
# 3
cat office | formail +90 -30 -s > office.3
# 4
cat office | formail +120 -30 -s > office.4
# 5
cat office | formail +150 -30 -s > office.5
# 6
cat office | formail +180 -30 -s > office.6
# 7
cat office | formail +210 -30 -s > office.7
# 8
cat office | formail +240 -30 -s > office.8
# 9
cat office | formail +270 -30 -s > office.9
# 10
cat office | formail +300 -30 -s > office.10
# 11
cat office | formail +330 -30 -s > office.11

cut.sh (prints a script like run.sh to stdout, e.g. ./cut.sh > run.sh)

#!/bin/bash

_mbox=office

# Count the messages by their "From " separator lines.
_c=$(grep -c "^From " "$_mbox")

echo "#!/bin/bash"
echo "# number of mail: $_c"

# Round up, so a count that is an exact multiple of 30 gets no empty extra chunk.
_loop=$(( (_c + 29) / 30 ))

for (( _n=0; _n<_loop; _n++ ))
do
        echo "# $_n"
        if [ "$_n" -eq 0 ]
        then
                echo "cat $_mbox | formail -30 -s > $_mbox.$_n"
        else
                echo "cat $_mbox | formail +$(( _n * 30 )) -30 -s > $_mbox.$_n"
        fi
done

 

fix mailbox

formail -s < input.mbox > output.mbox

 


 
