Parsing Outlook emails in Java

Pavel Polívka - Sep 21 '20 - Dev Community

Recently I needed to parse Outlook emails to extract some values so that automated tests could pass multifactor authentication. I was hoping for a naïve implementation in JavaScript but could not find a reliable solution there, so I searched for a good library in Java. I was not even surprised that there were several solutions for parsing Outlook msg files. Java truly has a library for everything.

I chose the Auxilii msgparser library, as it seemed like the easiest solution to use.

I added it via Maven:

<dependency>
    <groupId>com.auxilii.msgparser</groupId>
    <artifactId>msgparser</artifactId>
    <version>1.1.15</version>
</dependency>

Usage is then straightforward:

Message parsedMessage = new MsgParser().parseMsg(msgFile.getInputStream());
String body = parsedMessage.getBodyText();
List<Attachment> attachments = parsedMessage.getAttachments();

Please be aware that Outlook on macOS does not use the msg format for its emails. Emails exported on a Mac are eml files. Those are exported in plain text, so they can be parsed via regex just by reading the file.
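
Since an eml file is plain text with the headers separated from the body by the first empty line, the body can be isolated before running any regex over it. The following is a minimal sketch of that idea, not a full parser: the class `EmlBody` and its method `bodyOf` are hypothetical names, and it assumes a simple, single-part message. It does not decode multipart, quoted-printable, or base64 content.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of msgparser.
public class EmlBody {

    // Returns everything after the first empty line, which separates
    // the headers from the body in a plain-text email message.
    // Returns an empty string if the message has no body.
    public static String bodyOf(String rawEml) {
        // Normalize CRLF line endings so one separator check is enough.
        String normalized = rawEml.replace("\r\n", "\n");
        int split = normalized.indexOf("\n\n");
        return split < 0 ? "" : normalized.substring(split + 2);
    }

    // Convenience overload that reads the file first.
    // Assumption: the file is UTF-8 encoded.
    public static String bodyOf(Path emlFile) throws IOException {
        String raw = new String(Files.readAllBytes(emlFile), StandardCharsets.UTF_8);
        return bodyOf(raw);
    }
}
```

For anything beyond simple plain-text messages, a real MIME parser would be the safer choice.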

The whole code supporting both formats would look like this:

String body = "";
if (file.getName().endsWith(".msg")) {
    // Outlook msg format: let msgparser extract the body text
    Message parsedMessage = new MsgParser().parseMsg(file);
    body = parsedMessage.getBodyText();
} else if (file.getName().endsWith(".eml")) {
    // eml is plain text: read the whole file as a string
    body = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
}
// here parse your body
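
As an illustration of the "parse your body" step, here is a minimal sketch that pulls a verification code out of the body with a regex. The class `OtpExtractor`, the method `extractCode`, and the sample email text are hypothetical, and it assumes the code is a standalone run of exactly six digits; adjust the pattern to whatever your emails actually contain.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for pulling a one-time code out of an email body.
public class OtpExtractor {

    // Assumption: the code is a standalone run of exactly six digits.
    private static final Pattern CODE = Pattern.compile("\\b(\\d{6})\\b");

    // Returns the first six-digit code found in the body, if any.
    public static Optional<String> extractCode(String body) {
        if (body == null) {
            return Optional.empty();
        }
        Matcher matcher = CODE.matcher(body);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample text, only for demonstration.
        String body = "Your verification code is 483920. It expires in 10 minutes.";
        System.out.println(extractCode(body).orElse("no code found"));
    }
}
```

Returning an `Optional` lets the test fail with a clear message when the email does not contain a code, instead of hitting a null later on.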

If this is interesting to you, you can follow me on Twitter.
