Applies to: Exchange Server 2007 SP3, Exchange Server 2007 SP2, Exchange Server 2007 SP1, Exchange Server 2007
Topic Last Modified: 2009-09-09
Content conversion is the process of correctly formatting a message for each recipient. The decision to perform content conversion on a message depends on the destination and format of the message that is being processed. Messages that are sent to recipients inside the Microsoft Exchange Server organization don't require content conversion. Only messages that are sent to external recipients may require it.
In an Exchange Server 2007 organization, content conversion is handled by the categorizer on a server that has the Hub Transport server role installed. Each newly arrived message is categorized after it is put in the Submission queue. In addition to recipient resolution and routing resolution, content conversion is performed on the message before the message is put in a delivery queue. If a single message has multiple recipients, content conversion determines the appropriate encoding for each recipient. On an Edge Transport server, an abbreviated categorization occurs that does not involve content conversion.
Understanding the Structure of E-mail Messages
To better understand content conversion, you must understand the structure of e-mail messages. The Simple Mail Transfer Protocol (SMTP) uses plain 7-bit US-ASCII text to compose and send e-mail messages. A standard SMTP message consists of the following elements:
- The message envelope: The message envelope is defined in RFC 2821. It contains the information that is required to transmit and deliver the message. Recipients never see the message envelope, because it is generated by the message transmission process and is not part of the message contents.
- The message contents: The message contents are defined in RFC 2822 and consist of the following elements:
  - The message header: The message header is a collection of header fields. Each header field consists of a field name, followed by a colon character ( : ), followed by a field body, and ended by a carriage return line feed (CRLF) character combination.
    A field name must be composed of printable US-ASCII text characters except the colon character ( : ). Specifically, ASCII characters that have values from 33 to 57 inclusive and 59 to 126 inclusive are permitted.
    A field body may be composed of any US-ASCII characters, except for the carriage return character (CR) and the line feed character (LF). However, a field body may contain the CRLF character combination when it is used in header folding. Header folding is the separation of a single header field body into multiple lines as described in section 2.2.3 of RFC 2822. Other field body syntax requirements are described in sections 3 and 4 of RFC 2822.
  - The message body: The message body is a collection of lines of US-ASCII text characters that appears after the message header. The message header and the message body are separated by an empty line that ends with the CRLF character combination. The message body is optional. Each line of text in the message body must be no more than 998 characters long, excluding the CRLF. The CR and LF characters can appear only together, to indicate the end of a line.
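The header and body rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not how Exchange parses messages: the function names (`is_valid_field_name`, `unfold`, `split_message`, `body_lines_within_limit`) and the sample message are assumptions made for this example, and repeated header fields are deliberately not handled.

```python
import re

# Printable US-ASCII except the colon: codes 33-57 and 59-126.
_FIELD_NAME = re.compile(r"[\x21-\x39\x3B-\x7E]+")
MAX_LINE_LENGTH = 998  # per line, not counting the trailing CRLF


def is_valid_field_name(name):
    """Return True if every character of name is in 33-57 or 59-126."""
    return _FIELD_NAME.fullmatch(name) is not None


def unfold(header_block):
    """Undo header folding: a CRLF followed by a space or tab continues
    the current field instead of starting a new one."""
    unfolded = re.sub(r"\r\n(?=[ \t])", "", header_block)
    return [line for line in unfolded.split("\r\n") if line]


def split_message(raw):
    """Split raw message contents into (header fields, body).

    The header and the body are separated by the first empty line.
    Repeated field names are not handled here; the last one wins.
    """
    header_block, _, body = raw.partition("\r\n\r\n")
    fields = {}
    for line in unfold(header_block):
        name, _, value = line.partition(":")
        if not is_valid_field_name(name):
            raise ValueError(f"invalid header field name: {name!r}")
        fields[name] = value.strip()
    return fields, body


def body_lines_within_limit(body):
    """Check that no body line exceeds MAX_LINE_LENGTH characters."""
    return all(len(line) <= MAX_LINE_LENGTH for line in body.split("\r\n"))


# Hypothetical sample message with one folded header field.
raw = (
    "From: sender@example.com\r\n"
    "Subject: a long subject that\r\n"
    " continues on a folded line\r\n"
    "\r\n"
    "Hello.\r\n"
)
fields, body = split_message(raw)
print(fields["Subject"])
print(body_lines_within_limit(body))
```

The folded Subject field is rejoined into a single field body before the name and body are separated at the first colon.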
When SMTP messages contain elements that are not plain US-ASCII text, the message must be encoded to preserve those elements. The Multipurpose Internet Mail Extensions (MIME) standard defines a method of encoding non-text content in messages. MIME allows for text in other character sets, non-text attachments, multipart message bodies, and header fields in other character sets. MIME is defined in RFC 2045, RFC 2046, RFC 2047, RFC 2048, and RFC 2049. MIME defines a collection of header fields that specify additional message attributes. The following table describes some important MIME header fields.
Important MIME header fields
Header field name | Default value | Description
---|---|---
MIME-Version: | 1.0 | This header field is the first MIME header field that appears in a MIME-formatted message. It appears after the other standard RFC 2822 header fields, but before any other MIME header fields. MIME-aware e-mail clients use this header field to identify a MIME-encoded message. When this header field is absent, MIME-aware e-mail clients identify the message as plain text.
Content-Type: | text/plain | This header field identifies the media type of the message content as described in RFC 2046. A media type consists of a type, a subtype, and optional parameters, such as a charset= parameter that defines the MIME character encoding. Types that begin with "x-" are non-standard. Subtypes that begin with "vnd." are vendor-specific. The Internet Assigned Numbers Authority (IANA) maintains a list of registered media types. For more information, see MIME Media Types. The multipart media type allows for multiple message parts in the same message by using sections defined by different media types. Some Content-Type: field values include text/plain, text/html, multipart/mixed, and multipart/alternative.
Content-Transfer-Encoding: | 7bit | This header field describes either the encoding algorithm that was used to transform the message body data, or, when no encoding algorithm was used, the current condition of the message body data. A MIME message can contain multiple Content-Transfer-Encoding: header fields. When the header field appears in the message header, it applies to the whole body of the message. When it appears in one of the parts of a multipart message, it applies only to that part of the message. When an encoding algorithm is applied to the message body data, the data is transformed into plain US-ASCII text. This transformation allows the message to travel through older SMTP messaging servers that support only messages in US-ASCII text. The values that indicate an encoding algorithm was used on the message body are Quoted-printable and Base64. Typically, you won't see multiple encoding algorithms used in the same message. The values that indicate no encoding algorithm was used on the message body are 7bit, 8bit, and Binary. These three values are mutually exclusive and never exist together in the same multipart message. The Quoted-printable or Base64 values may appear in a 7bit or 8bit multipart message body, but never in a Binary message body. If a multipart message body contains different parts that are composed of 7bit and 8bit content, the whole message is classified as 8bit. If a multipart message body contains different parts composed of 7bit, 8bit, and Binary content, the whole message is classified as Binary.
Content-Disposition: | Attachment | This header field instructs a MIME-enabled e-mail client on how it should display an attached file, and is described in RFC 2183. The values of this field may be Inline or Attachment. When the value is Inline, the attachment is displayed in the message body. When the value is Attachment, the attached file appears as a regular attachment that is separate from the message body. Other parameters are available when the value is Attachment, such as Filename, Creation-date, and Size.
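The classification rule for multipart messages in the Content-Transfer-Encoding: row can be expressed as a small function. The following Python sketch is illustrative only. The function name `classify_multipart` is hypothetical, and the sketch assumes that Quoted-printable and Base64 parts count as 7bit content, because their encoded output is plain US-ASCII text. It also treats an encoded part inside a Binary message as an error, mirroring the statement that this combination never occurs.

```python
# Conditions that mean "no encoding algorithm was applied", ordered from
# most restrictive (7bit) to least restrictive (binary).
_UNENCODED_ORDER = ["7bit", "8bit", "binary"]
_ENCODED = {"quoted-printable", "base64"}


def classify_multipart(part_encodings):
    """Return the overall classification of a multipart message body,
    given the Content-Transfer-Encoding value of each part."""
    highest = 0          # index into _UNENCODED_ORDER; 7bit is the default
    saw_encoded = False
    for value in part_encodings:
        key = value.strip().lower()
        if key in _ENCODED:
            # Assumption: encoded data is US-ASCII, so it does not raise
            # the classification above 7bit.
            saw_encoded = True
        elif key in _UNENCODED_ORDER:
            highest = max(highest, _UNENCODED_ORDER.index(key))
        else:
            raise ValueError(f"unknown Content-Transfer-Encoding: {value!r}")
    overall = _UNENCODED_ORDER[highest]
    if overall == "binary" and saw_encoded:
        raise ValueError("encoded parts are not expected in a Binary body")
    return overall


print(classify_multipart(["7bit", "8bit"]))
print(classify_multipart(["7bit", "8bit", "Binary"]))
print(classify_multipart(["7bit", "Base64"]))
```

The least restrictive condition found in any part determines the classification of the whole message, which is why a single 8bit part makes an otherwise 7bit message 8bit.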
Exchange 2007 and Outlook Message Formats
The following list describes the basic message formats that are available in Exchange 2007 and Microsoft Outlook:
- Plain text: A plain text message uses only US-ASCII text as described in RFC 2822. The message can't contain different fonts or other text formatting. The following two formats can be used for a plain text message:
  - The message headers and the message body are composed of US-ASCII text.
  - The message is MIME-encoded with a Content-Type value of text/plain, and a Content-Transfer-Encoding value of 7bit for the text parts of a multipart message. Any message attachments are encoded by using Quoted-printable or Base64 encoding. By default, when you compose and send a plain text message in Outlook, the message is MIME-encoded with a Content-Type value of text/plain.
- HTML: An HTML message supports text formatting, background images, tables, bullet points, and other graphical elements. By definition, an HTML-formatted message must be MIME-encoded to preserve these formatting elements.
- Rich text format (RTF): RTF supports text formatting and other graphical elements. In this context, RTF is synonymous with the Transport Neutral Encapsulation Format (TNEF), and the two terms can be used interchangeably.
  Only Outlook and a few other MAPI e-mail clients understand RTF messages. MAPI is a Microsoft-developed messaging architecture that enables multiple applications to interact with different messaging systems across a variety of hardware platforms. MAPI is built on the Component Object Model (COM) architecture. Outlook uses MAPI to communicate with mailboxes on a computer running Exchange 2007 that has the Mailbox server role installed.
  The rich text message format is completely different from the rich text document format that is available in Microsoft Word.
- TNEF: TNEF is a Microsoft-specific format for encapsulating MAPI message properties. A TNEF message contains a plain text version of the message and an attachment that packages the original formatted version of the message. Typically, this attachment is named Winmail.dat. The Winmail.dat attachment includes the following information:
  - The original formatted version of the message, including, for example, fonts, text sizes, and text colors
  - OLE objects, including, for example, embedded pictures or embedded Microsoft Office documents
  - Special Outlook features, including, for example, custom forms, voting buttons, or meeting requests
  - Regular message attachments that were in the original message
  A TNEF message can take either of the following forms:
  - An RFC 2822-compliant message composed of only US-ASCII text
  - A multipart MIME-encoded message that has a Winmail.dat attachment
  An e-mail client that doesn't understand TNEF may present a TNEF message in either of the following ways:
  - The plain text version of the message is displayed, and the message contains an attachment named Winmail.dat, Win.dat, or some other generic name such as Attnnnnn.dat or Attnnnnn.eml, where the nnnnn placeholder represents a random number.
  - The plain text version of the message is displayed. The TNEF attachment is ignored or removed. The result is a plain text message.
  Messaging servers that understand TNEF can be configured to remove TNEF attachments from incoming messages. The result is a plain text message. Moreover, some e-mail clients, such as Microsoft Outlook Express, may not understand TNEF but recognize and ignore TNEF attachments. The result is again a plain text message.
  TNEF is understood by Exchange Server 5.0 and later versions. TNEF messages are transferred between SMTP messaging servers by using the standard DATA command. Exchange automatically uses TNEF in the following situations:
  - Exchange 2000 Server: TNEF is used for messages that are transferred between Exchange servers that are in different routing groups.
  - Exchange Server 2003: If the Exchange organization is in mixed mode, TNEF is used for messages that are transferred between Exchange servers that are in different routing groups.
- Summary Transport Neutral Encapsulation Format (STNEF): STNEF is equivalent to TNEF. However, STNEF messages are encoded differently than TNEF messages. Specifically, STNEF messages are always MIME-encoded and always have a Content-Transfer-Encoding value of Binary. Therefore, there is no plain text representation of the message, and there is no distinct Winmail.dat attachment in the body of the message. The whole message is represented by using only binary data. Messages that have a Content-Transfer-Encoding value of Binary can be transferred only between SMTP messaging servers that support and advertise the BINARYMIME and CHUNKING SMTP extensions as defined in RFC 3030. These messages are always transferred between SMTP messaging servers by using the BDAT command, instead of the standard DATA command.
  STNEF is understood by Exchange 2000 and later versions. Exchange automatically uses STNEF in the following situations:
  - Exchange 2000: STNEF is used for messages that are transferred between Exchange servers that are in the same routing group. An unsupported hotfix also enables Exchange 2000 to use STNEF for messages that are transferred between Exchange servers in different routing groups.
  - Exchange 2003: If the Exchange organization is in native mode, STNEF is used for all messages that are transferred between Exchange servers in the organization.
  - Exchange 2007: STNEF is used for all messages that are transferred between Exchange servers in the organization.
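The transfer requirement for messages with a Content-Transfer-Encoding value of Binary can be illustrated with a short Python sketch. It is not Exchange's implementation. The function names `parse_ehlo_extensions` and `choose_transfer_command` are hypothetical, and the sample EHLO reply is invented for this example. The sketch only shows the decision described above: binary content requires that the receiving server advertise both BINARYMIME and CHUNKING, and it is then sent with BDAT instead of DATA.

```python
def parse_ehlo_extensions(ehlo_reply):
    """Collect the extension keywords from a multi-line 250 reply to EHLO."""
    keywords = set()
    lines = ehlo_reply.splitlines()
    for line in lines[1:]:  # the first line is the greeting, not an extension
        if not line.startswith("250"):
            continue
        text = line[4:].strip()  # skip the "250-" or "250 " prefix
        if text:
            keywords.add(text.split()[0].upper())
    return keywords


def choose_transfer_command(extensions, body_is_binary):
    """Pick DATA or BDAT for a message, given the advertised extensions."""
    if not body_is_binary:
        return "DATA"
    if {"BINARYMIME", "CHUNKING"} <= extensions:
        return "BDAT"
    # Without both extensions, binary content cannot be transferred as is.
    raise ValueError("server does not advertise BINARYMIME and CHUNKING")


# Hypothetical EHLO reply from a receiving server.
reply = (
    "250-mail.example.com Hello\r\n"
    "250-SIZE 10485760\r\n"
    "250-CHUNKING\r\n"
    "250-BINARYMIME\r\n"
    "250 8BITMIME\r\n"
)
extensions = parse_ehlo_extensions(reply)
print(choose_transfer_command(extensions, body_is_binary=True))
print(choose_transfer_command(extensions, body_is_binary=False))
```

Raising an error when the extensions are missing is a design choice for this sketch. A real sender would have to convert the message to a format that the receiving server can accept.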
Elements of Content Conversion
Content conversion is the act of correctly formatting a message for each external recipient. This conversion is performed by the categorizer on a Hub Transport server.
The content conversion options that you can set in an Exchange organization fall into the following categories:
- TNEF conversion options: These conversion options specify whether TNEF should be preserved or removed from messages that leave the Exchange organization.
- Message encoding options: These options specify settings such as MIME and non-MIME character sets, message encoding, and attachment formats.
These conversion and encoding options are independent of one another. For example, whether TNEF messages can leave the Exchange organization has nothing to do with the MIME encoding settings or plain text encoding settings of those messages.
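To make the idea of removing TNEF from a message concrete, the following Python sketch strips a Winmail.dat attachment from a multipart MIME message by using the standard library `email` package. It is an illustration under stated assumptions, not the conversion that Exchange performs. The function names `is_tnef_part` and `strip_tnef` are hypothetical, the sketch assumes that TNEF attachments are labeled with the `application/ms-tnef` or `application/vnd.ms-tnef` media type or are named Winmail.dat, and it inspects only the top-level parts of the message.

```python
from email.message import EmailMessage

# Assumption: media types commonly used to label TNEF attachments.
TNEF_TYPES = {"application/ms-tnef", "application/vnd.ms-tnef"}


def is_tnef_part(part):
    """Return True if a MIME part looks like a TNEF attachment."""
    filename = (part.get_filename() or "").lower()
    return part.get_content_type() in TNEF_TYPES or filename == "winmail.dat"


def strip_tnef(msg):
    """Remove top-level TNEF attachments in place; return how many were removed."""
    if not msg.is_multipart():
        return 0
    parts = msg.get_payload()
    kept = [part for part in parts if not is_tnef_part(part)]
    msg.set_payload(kept)
    return len(parts) - len(kept)


# Build a hypothetical multipart message: plain text plus a TNEF attachment.
msg = EmailMessage()
msg["Subject"] = "Status report"
msg.set_content("Plain text version of the message.")
msg.add_attachment(
    b"placeholder bytes, not real TNEF data",
    maintype="application",
    subtype="ms-tnef",
    filename="winmail.dat",
)

removed = strip_tnef(msg)
print(removed, [part.get_content_type() for part in msg.get_payload()])
```

After the attachment is removed, only the plain text part remains, which matches the "result is a plain text message" outcome described for recipients that cannot use TNEF.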
You can specify the content conversion options at various levels of the Exchange organization as described in the following list:
- Remote domain settings: Remote domains define the settings for outgoing message transfers between the Exchange 2007 organization and domains outside the Active Directory directory service forest. Even if you don't create remote domain entries for specific domains, there is a predefined remote domain named Default that applies to all remote address spaces ( * ).
- Mail user and mail contact settings: Mail users resemble mail contacts. Both have external e-mail addresses and contain information about people outside the Exchange organization. The main difference is that mail users have security contexts that can be used to log on to the Active Directory domain and access resources to which they have been granted permission.
- Outlook settings: Outlook lets you set the message formatting and encoding options that are described in the following list:
  - Message format: You can set the default message format for all messages. You can also override the default message format as you compose a specific message.
  - Internet message format: You can control whether TNEF messages are sent to remote recipients or whether they are first converted to a more compatible format. You can also specify various message encoding options for messages that are sent to remote recipients. These settings do not apply to messages sent to recipients in the Exchange organization.
  - Internet recipient message format: You can control whether TNEF messages are sent to specific recipients or whether they are first converted to a more compatible format. You can set the conversion options for specific contacts in your Contacts folder, and you can override the conversion options for a specific recipient in the To:, Cc:, or Bcc: fields as you compose a message. These conversion options are not available for recipients in the Exchange organization.
  - Internet recipient message encoding options: You can control the MIME or plain text encoding options for specific contacts in your Contacts folder, and you can override the conversion options for a specific recipient in the To:, Cc:, or Bcc: fields as you compose a message. These conversion options are not available for recipients in the Exchange organization.
  - International options: You can control the character sets that are used in messages.
TNEF Conversion Options
You can specify the TNEF conversion options at the following levels:
- Remote domain settings
- Mail user and mail contact settings
- Outlook settings
  - Message format
  - Internet message format
  - Internet recipient message format
For detailed information, see TNEF Conversion Options.
Message Encoding Options
You can specify the message encoding options at the following levels:
- Remote domain settings
- Mail user and mail contact settings
- Outlook settings
  - Message format
  - Internet message format
  - Internet recipient message format
  - Message character set encoding options
For detailed information, see Message Encoding Options.
Content Conversion Performed by the Store Driver
The store driver also performs a kind of content conversion. The store driver exists on Hub Transport servers to transport messages between mailboxes on Mailbox servers and the Submission queue. Specifically, the store driver transports messages from the sender's Outbox to the Submission queue on the Hub Transport server, and it transports messages from the MAPI delivery queue on the Hub Transport server to the recipient's Inbox. The store driver converts all outgoing messages from MAPI and converts all incoming messages to MAPI. Content conversion tracing captures failures that occur during these store driver conversions.
For more information, see Managing Content Conversion Tracing.