General Information
EDI software in particular is designed to exchange business data with partners throughout the world. It used to work in conjunction with VANs – Value Added Networks – which ensured that data reached their destination reliably and without being accessed by any unauthorised parties. Nowadays, however, we have the open Internet, along with a whole host of communications protocols to protect your data from nosy competitors.
We simply choose the appropriate protocol to suit our requirements. Sometimes speed of response will be important, while at other times we will be more interested in secure transfers or straightforward processing. Over the following pages, we will introduce you to the standard protocols, offer a quick overview of how they work, and compare their respective pros and cons. Which one you decide to use will then be up to you! Provided that your software can handle that protocol, of course…
Mail
Everybody is familiar with email, so we don’t need to explain what it is here. But we should still take a look at how it works.
Sending emails
When you click “Send” in Thunderbird, Pegasus or Outlook, the program establishes contact with a mail server that you (or your IT department) will have already specified in your software. The protocol used for this is called SMTP – Simple Mail Transfer Protocol. You can find a detailed description of it on Wikipedia. This protocol is designed to send emails from you to your server, and from there on to many other servers across the world until they finally end up in the recipient’s own mail server, which is where they normally remain until the recipient establishes contact with their mail server and empties their electronic mailbox. That brings us on to the next point:
Receiving emails
The POP protocol – Post Office Protocol – has been around for a long time now. It’s currently in its third version, and you will hear the name POP3 come up in many quarters. It is comparatively spartan – you can use it to call up a list of emails currently waiting from the server, to download those emails, and to delete them from the server.
Beyond that, it doesn’t offer a whole lot of functionality – but for most users this is perfectly adequate. Once you’ve downloaded your emails to your local drive, you can sort them into different folders (often automatically using filters in your email software) and manage your own data independently.
However, anyone who needs more than that can use IMAP – the Internet Message Access Protocol. On the one hand, with this method, the emails could theoretically be left on the server forevermore; yet on the other, they come neatly pre-sorted into folders on the server (similar to a directory structure in your file system), and you can even share your emails with other users – e.g. an entire project team. Although IMAP is not much younger than POP3, there are still many mail servers where it hasn’t been implemented yet.
You can find detailed information about the special features of both these protocols on Wikipedia, along with their pros and cons in everyday use. However, we are not primarily interested in emails as they are used in their traditional sense – i.e. for communication between two or more people. We are far more interested in how useful email is for EDI.
E-Mail and EDI
In principle, it’s possible to exchange any data via email – including anything that might be sent across the world under the auspices of EDI. However, this would not be our first choice of communication method. The main reasons for this are:
- Some mail servers apply size restrictions, so large attachments could simply go missing.
- Nowadays, emails are typically no longer acknowledged with a confirmation of receipt, meaning that it is impossible to know whether your data has arrived or not.
- There is no guarantee that an email will arrive at its destination quickly – or at all.
- Emails are completely unsecured against access by third parties, meaning that they can be read or manipulated. Of course, you can use software like PGP to at least encrypt and/or sign the content, but more secure paths are available.
- You are at the mercy of your email software – for instance, there are mail servers out there that will take it upon themselves to modify the encoding of texts in attachments.
- Virus scanners and other security measures can delete or block individual attachments, or even entire emails.
These shortcomings of standard email have led to the development of AS1 (Applicability Statement 1), a standard for securely transferring business data via email. However, this never really caught on among users, so we can safely ignore it.
Nowadays, business data are still sent by email – even in EDI, though we should really call this “half-EDI”, since in most cases there will be a human at one end of the transfer. For example, invoices might be automatically issued in PDF format and sent via email; or employees working for a small customer may still enter orders manually into Excel files that are subsequently emailed to the supplier, where they are then input into the automatic processing system. In true EDI, where data travel from one system to another without any human intervention, email should really have died out by now. Yet for processes where humans are still involved, EDI software should also be capable of sending and receiving data via email as a precautionary measure. And of course, the software should also ideally support all of the standard protocols (SMTP, POP3, IMAP).
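For the cases where EDI software does have to send data by email, the following is a minimal sketch of how a file could be wrapped as an attachment and handed to an SMTP server, using only the Python standard library. The addresses, the file name, the sample payload, the port and the `send_mail` helper are assumptions made for illustration; a real setup would take them from the partner configuration.

```python
import smtplib
from email.message import EmailMessage


def build_edi_mail(sender: str, recipient: str, filename: str,
                   payload: bytes) -> EmailMessage:
    """Wrap a data file as a binary attachment in an email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"EDI data: {filename}"
    msg.set_content("Automatically generated message; the data are attached.")
    # Binary attachments are Base64-encoded by the library by default.
    msg.add_attachment(payload, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


def send_mail(msg: EmailMessage, host: str, user: str, password: str) -> None:
    """Hand the message to an SMTP server (host, port and login are assumptions)."""
    with smtplib.SMTP(host, 587) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


# Hypothetical example data; nothing is sent until send_mail() is called.
mail = build_edi_mail("orders@customer.example", "edi@supplier.example",
                      "orders.edi", b"UNH+1+ORDERS:D:96A:UN'")
```

Note that a successful `send_message` call only means that the first mail server accepted the message; it says nothing about whether the recipient ever receives it, which is exactly the weakness listed above.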
FTP
or File Transfer Protocol – is designed for file transfers, as its name suggests. Most people will have already worked with FTP at some point – e.g. in FileZilla or your Internet browser. You will also probably be aware that you can use FTP from the console (e.g. Unix Shell, or DOS Box in Windows). You can find out more about the operating principles of FTP in Wikipedia.
FTP for EDI
Essentially, we can only do one thing with FTP: transfer files. Users can log into an FTP server and either use “get FileName” to download a file (or alternatively “mget FileTemplate” to download multiple files), or they can use “put FileName” to upload a file (or multiple files with “mput FileTemplate”). All well and good. Once that’s done, we end up with a file stored somewhere – either on a local drive or on the server. So what comes next? Well, the EDI software needs to be able to access the file and process it. In the simplest scenario, the workflow might look as follows:
- A system job (a Unix Cron Job or its Windows counterpart) regularly retrieves files from a particular location via FTP, or alternatively a partner sends them at regular intervals to your FTP server. In either case, the data ends up stored in local directories.
- Meanwhile, the EDI software regularly checks the relevant directory for new files.
- If it finds any, it reads and processes them.
And things might work similarly in the opposite direction, too: The result of the processing in the EDI system might be stored somewhere in a directory ready for another system job to send the data elsewhere via FTP, or for your partner to retrieve them from you.
This would be a solution for a very simple EDI system that didn’t have any FTP capabilities itself. Of course, it would be better if everything happened in one place – i.e. if the EDI software could itself fetch or receive files via FTP independently. In this scenario, there are two options for inputting data via FTP:
1 The EDI software establishes its own FTP connection with a remote server, retrieves the data and processes them.
2 A remote client establishes a direct connection with the EDI system (which acts as a FTP server) and hands it the files.
Either way, we skip the intermediate step of storing files locally and then re-inputting them using a scheduled task. Option two even allows us to process the data the moment they are received from the partner. This is a huge advantage when it comes to time-critical information.
Likewise, the return path is also much quicker and simpler if the EDI software being used provides its own FTP client and can output your data without any delay.
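The simple workflow described in this section (a job fetches files via FTP, and the EDI software regularly checks a directory for new ones) can be sketched as follows. This is only an illustration using Python's standard `ftplib` and `pathlib` modules; the function names, the file name and the in-memory `seen` set are assumptions, and `fetch_all` is shown but not called because it needs a live FTP connection.

```python
import tempfile
from ftplib import FTP
from pathlib import Path


def fetch_all(ftp: FTP, inbox: Path) -> None:
    """Download everything listed in the server's current directory.

    Assumes the listing contains only plain files. Binary mode is used
    so that the content arrives byte for byte.
    """
    for name in ftp.nlst():
        with open(inbox / name, "wb") as fh:
            ftp.retrbinary(f"RETR {name}", fh.write)


def poll_inbox(inbox: Path, seen: set) -> list:
    """Return the files that have appeared in `inbox` since the last poll."""
    new_files = []
    for path in sorted(inbox.iterdir()):
        if path.is_file() and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files


# Usage: simulate two polling cycles against a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    seen = set()
    (inbox / "orders_001.edi").write_text("sample content")
    first = [p.name for p in poll_inbox(inbox, seen)]
    second = [p.name for p in poll_inbox(inbox, seen)]
```

The first poll reports the new file and the second reports nothing. A production system would also have to make sure that a file is completely written before it is processed, which this sketch does not attempt.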
Pros & Cons
Pros:
- FTP transfers data directly to the destination system. That means you know if your data have arrived safely, and at what time. Plus, it’s almost impossible for a file to get lost en route (as can sometimes be the case with email).
- FTP clients exist on virtually every operating system, so all of your partners will be able to communicate with you.
- If you opt for a binary transfer, you can be sure that your data will arrive untampered with.
Cons:
- You can find out whether your data transfer was successful, but you have no way of knowing whether the EDI system is actually processing the data.
- FTP is not necessarily secure against eavesdropping or manipulation, although FTPS (i.e. the use of SSL during FTP transfers) provides some help with this.
- Firewalls and NATting often interfere with data port negotiation by preventing one system from instructing another to open a particular port, or by preventing the transfer from being routed through the port.
- Decent firewalls will usually identify which port is being negotiated and allow it to open; however, if SSL encryption is being used, the negotiation will go unheard. This can make it much more difficult to use FTP(S), and require a whole host of additional configurations (e.g. defining port ranges, and so on).
Conclusion
There is also an FTP-based standard called AS3 designed to ensure secure data transfers. This includes features like confirmations of receipt (aka MDNs), but much like AS1 (its email-based counterpart), it never really caught on and it plays an insignificant role in practice.
In short, FTP is much better suited to EDI scenarios than email, and it is used very frequently for these purposes; however, there are more optimal communication methods out there, such as AS2.
HTTP
HTTP – Hypertext Transfer Protocol – is familiar to everybody. Without it, there would be no Internet as we know it today. It was originally used to transport HTML pages and images, but has since been greatly expanded and offers a range of possibilities for transferring data of any kind. HTML forms – including with data upload functionality – have long been standard on the web, and files of almost every type are available for download. You can find out more about the history of HTTP and how it works on Wikipedia.
Additional protocols have also been created based on the HTTP protocol that are customised for various aspects of data transfer. Particularly noteworthy examples include WebDAV and AS2, which both have their own dedicated chapters. Here, however, we will look at the use of the original HTTP protocol for EDI purposes.
Browser-based or automatic
The HTML forms we mentioned above, along with download links on websites, will be the most familiar means of transferring data via HTTP for most users. You simply enter some information, perhaps add a file (or even multiple files) for upload, and then click a button to send everything to the web server. After evaluating the information, the server might then respond not only with a website, but with an entire data file that you can save to your hard drive. There’s probably no need to talk about clicking on download links here.
As an example, you might sign in with a certain international postal service (the one with the brown trucks) in order to access specially created webpages where you can download invoices addressed to your company (for instance). Strictly speaking, this doesn’t count as EDI, since only one side of the transfer is performed by a system that independently receives data, processes them and makes them available. At the other end, there is a human sitting in front of a browser. And yet…that doesn’t have to be the case. Anything that a human can input into a form (especially something that appears in a simple link) can also be entered automatically by a piece of software.
Simple GET requests
If you are capable of clicking a link in order to download or upload a file (something that your browser handles for you anyway), then an EDI system will also be able to obtain data by accessing the same URL: http://lobster.de/wp-content/uploads/2014/06/Lobster_data_Broschüre_web_2014_türkis.pdf
Admittedly, this is a poor example of EDI, since PDF documents are unsuitable for EDI processes.
Indeed, the URL given above only contains the address of a document. If you want to send data, they can easily be inserted into the URL too:
http://www.example.com/formular.cgi?parameter1=value1&parameter2=value2
This method of sending data to the server is called GET. It comes with a few disadvantages. For one thing, it is only possible to insert a limited number of relatively small values into a URL. Some very old web servers only tolerate URLs with a maximum length of 255 characters, and even if that limit doesn’t apply, things quickly get confusing with this method. That makes it effectively impossible to transfer files. And it’s also possible for anybody to see the data, since the entire URL travels openly across the web and can even be stored on various other computers for caching purposes. However, when used in conjunction with authentication mechanisms such as HTTP Basic Authentication, it can still be a perfectly adequate way to submit a customer number and retrieve a certain pre-defined set of data – e.g. all new orders.
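To illustrate how parameters end up in the URL of a GET request, here is a small sketch using Python's standard `urllib.parse` module. The parameter names `customer` and `query` and their values are invented for this example.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_get_url(base_url: str, params: dict) -> str:
    """Append percent-encoded parameters to a URL as a query string."""
    # quote_via=quote encodes a space as %20 (the default would be "+").
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


url = build_get_url("http://www.example.com/formular.cgi",
                    {"customer": "4711", "query": "new orders"})
print(url)
# prints http://www.example.com/formular.cgi?customer=4711&query=new%20orders

# The server side reverses the encoding to get the values back:
print(parse_qs(urlsplit(url).query))
```

Everything after the question mark is plainly readable, which is why this method should only carry values that are not sensitive.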
A more complicated option: POST
Strictly speaking, we need to distinguish between three varieties of POST:
- Simple POST: What would appear in the URL when using GET is now sent to the server as the body of the request via a data stream. As an example: parameter1=value1&parameter2=value2&File=Long%20text.
- Raw POST: a great option for sending a single file directly to the server. The body – i.e. the data stream – that goes to the server after the headers contains the file content in its original form. There is no need to code any additional special characters here, as in the example of the simple POST where the space character was replaced by %20 (something that would be especially necessary in a GET, incidentally).
- Multipart/form-data: This could be viewed as a combination of the previous two methods. On the one hand, it is very useful for transferring large files (sent as binary or, where required, encoded in Base64), just like a raw POST; however, it can also be used to combine multiple parameters, and even multiple files. Much like in an email containing multiple attachments, the body is subdivided into individual parts, each of which contains a parameter. And it makes no difference whether those parameters contain just one short text or an entire binary file. The use of MIME types and suitable content encodings allows each part to be transferred in a way that is appropriate to its contents.
In short, each of these alternatives has its pros and cons, and you should be able to decide which one you plan to use on a case-by-case basis. HTML forms generally use either a simple POST (when there is only a small amount of information to transfer) or multipart/form-data when files need to be transferred too. That’s why we call it “multipart/form-data”.
<form enctype="multipart/form-data" method="post" …>
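The difference between the three POST varieties is easiest to see in the request bodies themselves. The sketch below builds each body by hand with the Python standard library, without sending anything. The helper names, the file name `orders.edi` and the sample bytes are assumptions, and the multipart part carries the file content as raw bytes rather than Base64.

```python
import uuid
from urllib.parse import quote, urlencode


def simple_post_body(params: dict) -> tuple:
    """Simple POST: the former query string becomes the request body."""
    body = urlencode(params, quote_via=quote).encode("ascii")
    return body, "application/x-www-form-urlencoded"


def raw_post_body(content: bytes) -> tuple:
    """Raw POST: the body is the file content, unchanged."""
    return content, "application/octet-stream"


def multipart_post_body(fields: dict, files: dict) -> tuple:
    """Multipart: one body part per parameter or file, split by a boundary."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [f"--{boundary}".encode(),
                  f'Content-Disposition: form-data; name="{name}"'.encode(),
                  b"", value.encode("utf-8")]
    for name, (filename, content) in files.items():
        lines += [f"--{boundary}".encode(),
                  f'Content-Disposition: form-data; name="{name}"; '
                  f'filename="{filename}"'.encode(),
                  b"Content-Type: application/octet-stream",
                  b"", content]
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


simple_body, simple_type = simple_post_body(
    {"parameter1": "value1", "parameter2": "value2", "File": "Long text"})
raw_body, raw_type = raw_post_body(b"\x00\x01binary")
multi_body, multi_type = multipart_post_body(
    {"parameter1": "value1"}, {"File": ("orders.edi", b"\x00\x01binary")})
```

In each case the second return value is what would go into the Content-Type header of the request, which is how the server learns which of the three varieties it is looking at.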
Identification
HTTP was originally designed to make information available to the entire world. However, content soon began to appear that was only intended to be accessible to a restricted circle of users. That’s why there are options available to identify the user or system (the “client”) that is accessing a URL. Likewise, sometimes the client will want to make sure that they are accessing the right server rather than one belonging to some phisher or spoofer.
- One easy way of identifying the client to the server is via user names and passwords. In principle, this information can be transferred in one of two different ways, which we will describe in more detail below.
- Conversely, the server can only identify itself to the client by providing a trusted certificate. That happens at the point when HTTPS comes into play – by which we mean the use of SSL encryption in HTTP transfers. No doubt you will have already encountered various dialogue boxes online informing you that the certificate for a given server is untrustworthy, for one reason or another. On a lot of websites, we simply ignore the warning – though hopefully nobody does so when it comes from their bank.
- We can also use the same mechanism to designate the client as an authorised user. To do that, we simply reverse the certificate process, so that the client delivers a certificate to the server for it to review. In scenarios where only a precisely defined group of clients are allowed to gain access, the valid certificate can simply be stored on the server.
The walls have ears
HTTP does not have any built-in encryption. Even HTTP Basic Authentication – the simplest way of “logging in” to a web server – will encode the user name and password in Base64 during transfer, but it won’t encrypt them. As such, if you want to be reasonably sure that no third parties can see your data, you should use an HTTPS connection, which encrypts everything that goes through it in SSL.
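A few lines of Python show why Base64 offers no protection. The sketch builds the Authorization header used by HTTP Basic Authentication and then reverses it without any key. The user name and password are the well-known sample values from the HTTP authentication RFCs; the helper names are chosen for this example.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of the Authorization header for Basic Authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def read_basic_auth_header(value: str) -> tuple:
    """What any eavesdropper can do: decode the credentials again."""
    decoded = base64.b64decode(value[len("Basic "):]).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)                          # prints Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(read_basic_auth_header(header))  # prints ('Aladdin', 'open sesame')
```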
WebServices
Web services and HTTP don’t necessarily have to go together, but in practice, the vast majority of web services are accessed with HTTP, since it works particularly well as a request-response protocol. You send your request to the server and instantly receive the result of the query as a response. Theoretically, however, web services could also operate over FTP or even email in cases where response times aren’t an important consideration.
The most commonly used type of web service is the SOAP request, although some people prefer to use other options such as REST or RPC. Once again, you can find a detailed account of all this on Wikipedia. We can at least cover SOAP briefly here, which works as follows:
The client (e.g. an EDI software package) sends an XML document containing a request (HTTP POST) to the server. This request could be an inventory request, a room reservation, or anything else that the server is ready to accept. The XML document consists of a SOAP envelope with some header data, and a SOAP body containing the actual application data for the request.
The server (which can itself be an EDI system) receives the XML, evaluates it, puts the result into another SOAP body, wraps it with a SOAP envelope, and sends the whole thing back to the client as a response to the request. You can find a few brief examples of SOAP XML requests of this kind in the article on EAI and EDI.
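As a rough illustration of the structure just described, the following sketch assembles a SOAP 1.1 envelope with Python's standard `xml.etree.ElementTree` module. The operation `GetStock`, the parameter `ItemNumber` and the namespace `urn:example:inventory` are invented for this example; a real service defines its own names, usually in a WSDL file.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace
ET.register_namespace("soap", SOAP_NS)


def build_soap_request(operation: str, app_ns: str, params: dict) -> bytes:
    """Wrap application data in a SOAP envelope with header and body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{app_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(request, f"{{{app_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# Hypothetical inventory request; this would be sent as the body of an HTTP POST.
request_xml = build_soap_request("GetStock", "urn:example:inventory",
                                 {"ItemNumber": "4711"})
print(request_xml.decode("utf-8"))
```

The server's response has the same shape: an envelope whose body carries the result instead of the request.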
Pros & Cons
Pros:
- The great thing about HTTP is that it’s a request-response protocol. Unlike email, where you simply fire off your message and cross your fingers that it will arrive, or FTP, which can only confirm that your data have been successfully transferred, HTTP is capable of immediately telling you whether or not your data could be processed. And yes, it can even send back the result of that processing – just like a web service. On top of that, it can do so synchronously – i.e. as a direct response to the incoming request.
- If there is anything that can get through the majority of the world’s firewalls, then it’s HTTP.
- There are countless HTTP servers out there, with the best known being Apache and MS IIS.
- If correctly designed, an HTTP-based interface for downloading and uploading files can be accessed equally by a human working through a browser and by an automated piece of software.
Cons:
- HTTP wasn’t originally designed for exchanging important business data. There are mechanisms to authenticate the client to the server and vice versa, and data can also be encrypted. However, although both of these mechanisms have been individually standardised, their combined use is not. In principle, anyone can pick and choose the techniques that they find personally useful.
- It isn’t possible to sign data in HTTP or its spin-off versions. As such, there is no way to prove that the data you receive are genuine, or that they have come from the purported sender.
These disadvantages are addressed by the AS2 standard. AS2 uses HTTP as a transfer protocol, but specifies a range of mechanisms that dictate how authentication between the partners involved in the communication should take place, how data should be signed and encrypted, and so on. AS2 can only be used between two partners who have agreed in advance to do so, and who have already exchanged identifiers and certificates with each other. On top of that, AS2 software comes with a kind of quality seal that guarantees that the software complies with the defined standard.
If you want to offer users the option of transferring data to you and receiving it from you either manually or fully automatically, then HTTP is very useful. It would also make sense for the EDI software to provide HTTP interfaces, or to allow for connections with one of the commonly used web servers. If you have high security requirements, however, then it would be better to opt for an EDI system with AS2 integration.
AS2
AS2 stands for Applicability Statement 2. Great term, isn’t it? It somehow tells you … absolutely nothing. But you can always look it up.
In RFC 2026, “The Internet Standards Process”, we learn the following:
An Applicability Statement specifies how, and under what circumstances, one or more TSs may be applied to support a particular Internet capability. An AS may specify uses for TSs that are not Internet Standards, as discussed in Section 7.
An AS identifies the relevant TSs and the specific way in which they are to be combined, and may also specify particular values or ranges of TS parameters or subfunctions of a TS protocol that must be implemented. An AS also specifies the circumstances in which the use of a particular TS is required, recommended, or elective (see section 3.3).
And TS, in turn, stands for “Technical Specification”. So now we are all much the wiser. 😉
And people always complain about legalese…
In short, an Applicability Statement describes how to use and combine (usually several) technical standards in order to achieve a particular functionality on the Internet. Let’s take a practical approach and simply ask ourselves:
What is AS2?
AS2 is a transfer method developed specially for the interchange of business data. As such, it places special emphasis on things like security and traceability. Security is achieved through encryption and signatures, while traceability is provided by confirmations of receipt (known as MDNs).
The whole thing is based on HTTP, or HTTPS. This offers the advantage that the response to a transfer can be sent immediately (synchronously) within the same connection. In addition, the HTTP protocol can be applied almost anywhere, and is only blocked by firewalls in rare cases. HTTPS (i.e. SSL or TLS) also offers basic encryption of the entire communication. Unlike basic HTTP, however, AS2 offers only the send direction for data (equivalent to a POST or PUT in HTTP), with no file downloads similar to an HTTP GET. At most, you receive a brief confirmation of receipt in response.
There’s more to it than that, however. As with SSL, further certificates also come into play. The data receive additional levels of encryption and are signed. The S/MIME standard is used for this. On top of that, there is a confirmation of receipt. But let’s not get ahead of ourselves.
Interlude: certificates
Let’s talk a bit more about these certificates first.
In very simple terms, certificates (and we are particularly talking about public key certificates here) work by means of an ingenious mathematical trick known as asymmetric encryption. This process uses two keys in the form of large numbers. Either of these keys can be used to encrypt the data, but the other key is always required in order to decrypt them. In other words, the code used to encrypt the message cannot subsequently be used to decrypt it.
The practical result of this is that one of the codes can be published (this is known as the “public key”) so that anyone who wants to send data to you can use it to encrypt them first. Once they have done so, nobody – not even the sender – can decrypt the data. Only you can do so, using the other code that you have retained for yourself (as you might expect, this is then called the “private key”). In this way, you can ensure that the message can only be read by the intended recipient.
It also works the other way around. If you encrypt something with your private key, it can only be made legible again using your public key. In this way, you can guarantee that the data come from you and only you (provided that your private key hasn’t fallen into anybody else’s hands…). This reversal is exploited for the purposes of signing the data – only this doesn’t involve encrypting the entire message. Instead, a hash value (a kind of code – the simplest example would be a checksum) is created for the data, and only this value is encrypted using the private key. The recipient then creates the same hash value for the data (i.e. they run the same algorithm), decrypts the sender’s value, and compares the two figures. If the two values don’t match, that means either the data have been manipulated, or they haven’t come from the purported sender. Either way, the recipient will be suspicious of the message and can choose to reject it.
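The signing principle can be tried out with a deliberately simplified sketch. It uses textbook RSA arithmetic with a toy key pair built from two publicly known primes, so it is completely insecure and only meant to show the mechanics: hash the data, transform the hash with the private key, and let the recipient reverse that with the public key and compare. Real software uses vetted cryptographic libraries, padding schemes and far larger keys; the names and sample messages below are assumptions, and the code needs Python 3.8 or later.

```python
import hashlib

# Toy key pair from two well-known Mersenne primes. Anyone can look these
# up, so this offers no security at all; it only demonstrates the principle.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                      # modulus, part of both keys
PUBLIC_EXPONENT = 65537        # published
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))  # kept secret


def hash_value(data: bytes) -> int:
    """SHA-256 of the data, reduced so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Sender: transform the hash value with the private key."""
    return pow(hash_value(data), PRIVATE_EXPONENT, N)


def verify(data: bytes, signature: int) -> bool:
    """Recipient: undo the signature with the public key and compare hashes."""
    return pow(signature, PUBLIC_EXPONENT, N) == hash_value(data)


signature = sign(b"ORDER 4711: 10 boxes")
genuine = verify(b"ORDER 4711: 10 boxes", signature)
tampered = verify(b"ORDER 4711: 99 boxes", signature)
print(genuine, tampered)
```

The unchanged message verifies, while the altered one does not, because its hash value no longer matches the one recovered from the signature.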
Certificates contain the public key from a given pair of keys, along with some metadata (whom it belongs to, which server it is valid on, and so on). The certificate can also let you know whether it has been certified and by whom, so that you can trust it. All this is essential information in order to understand the principle behind AS2.
How AS2 works
Let’s enlist the help of Alice and Bob. Alice wants to send data to Bob via AS2.
During the transfer, the following steps take place:
• Alice creates a hash value for the data and signs it using her private key.
• Alice then encrypts the data (along with the signed hash value) using Bob’s public key.
• Alice establishes an HTTP or HTTPS connection with Bob.
• If she uses HTTPS, the entire transfer will have an extra layer of encryption – but we won’t go into that here.
• The identifiers (serial numbers) of the certificates used are also sent with the message in various special HTTP headers, along with the AS2 identifiers of the sender and recipient.
• A simple HTTP confirmation of receipt can be sent at this point if the official confirmation will only be sent later.
• Bob decrypts the transfer using his private key. The HTTP header specifies which key he needs to use.
• Bob creates a hash value for the decrypted data, decrypts Alice’s hash value using her public key, and compares the two.
• Bob decides whether to send Alice a positive or negative confirmation. He might choose to send a negative one if he suspects that the message has been manipulated, or if the decryption process has gone wrong.
• The confirmation of receipt (Message Disposition Notification or MDN) can be signed using Bob’s private key.
• Bob sends the MDN to Alice either synchronously (as a response to Alice’s HTTP request) or asynchronously (in a new HTTP connection between them, this time established by Bob).
This process is very neatly represented in a diagram in the Wikipedia article on AS2. By the way, if you’re wondering what we need the HTTPS encryption for, given that AS2 handles all that anyway: AS2 encrypts the data, but not the HTTP header. That means anyone can find out who is sending data to whom and with what certificates (as well as various other bits and pieces of information). HTTPS provides additional encryption for these metadata. Aside from that, the use of encryption and signatures is not strictly mandatory in AS2. You can choose to skip all that, if you want. However, if you don’t at least use HTTPS then you might as well just publish your data in the newspaper…
The MDN
The MDN is an important component of AS2, and therefore merits a more detailed description. It contains the message ID for the data transfer in need of confirmation. This ID is provided by the sender (in our case Alice) in an HTTP header.
So in simple terms, the MDN is a way of saying, “Thank you – I have received your message (no. XYZ), managed to decrypt it, and evaluated it as (most likely) authentic.” Or alternatively, it might say that the data were corrupt, an unknown certificate was used, or similar. A negative MDN is known as an NDN.
This information is important for the sender. It’s similar to sending a confirmation of receipt by registered letter – at this point, responsibility for the data is handed over to the recipient. The message confirms that the data have been received cleanly. If something goes wrong after that, the sender is off the hook. In this regard, it makes no difference whether the MDN is sent immediately as a response to the HTTP request that the data came in with, or a few minutes or even hours later as an independent transfer. In some cases, the HTTP transfer itself is carried out by one system, while the decryption and signature check are completed later by a separate system. That’s why you have the option to use either synchronous or asynchronous MDNs. A synchronous MDN goes out as a direct HTTP response to the HTTP request that the data arrived with. By contrast, an asynchronous MDN is sent back to the sender of the data in a separate HTTP request. However, the recipient can’t simply decide that they want to wait a while before sending their MDN – rather, the sender is the one who specifies to the recipient how they should send the MDN.
And if the sender requires an asynchronous MDN, they will also simultaneously provide the recipient with the URL to which that MDN should be sent. Any AS2 software that meets the standard will be able to handle both of these alternatives.
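How the sender expresses this choice can be sketched with the HTTP headers that, to the best of our reading, RFC 4130 defines for the purpose: Disposition-Notification-To requests an MDN, and an additional Receipt-Delivery-Option header carrying a URL requests an asynchronous one. The AS2 identifiers, the domain and the URL below are made up, and the sketch only assembles the headers; it does not sign, encrypt or send anything.

```python
from email.utils import make_msgid
from typing import Optional


def as2_request_headers(as2_from: str, as2_to: str,
                        async_mdn_url: Optional[str] = None) -> dict:
    """Assemble the AS2-specific HTTP headers for one message."""
    headers = {
        "AS2-Version": "1.0",
        "AS2-From": as2_from,
        "AS2-To": as2_to,
        # The MDN will refer back to this ID.
        "Message-ID": make_msgid(domain="alice.example"),
        # Presence of this header asks the recipient for an MDN.
        "Disposition-Notification-To": as2_from,
    }
    if async_mdn_url is not None:
        # Asks for the MDN to be delivered later, in a separate request.
        headers["Receipt-Delivery-Option"] = async_mdn_url
    return headers


sync_headers = as2_request_headers("ALICE-AS2", "BOB-AS2")
async_headers = as2_request_headers("ALICE-AS2", "BOB-AS2",
                                    "https://alice.example/as2/mdn")
```

The two results differ only in the Receipt-Delivery-Option entry, which matches the point made above: the sender, not the recipient, decides how the MDN travels.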
What you need for AS2
If you want to exchange data with your partners via AS2, you (and your partner) will need at least the following:
• One AS2 identifier per participant. You can either set these up yourself, or each participant can specify their own.
• One certificate per participant – although different certificates can be used for signatures and for encryption, in which case two certificates per participant would be needed. You can also get by without any certificates, but that wouldn’t be in the spirit of AS2.
• The public keys to all the certificates used by your partners.
• And – naturally – a software package capable of handling AS2.
Certificates are self-generated. There is a wide range of software on the market, and a good AS2 software package will be capable of creating certificates. However, your partners may choose to use software that insists on signed certificates (i.e. ones that are marked as trustworthy by Verisign or similar bodies). In that case, the signatures will cost you money.
The certificates are then exchanged in advance. You can opt to simply send them via email, since you only swap the certificates with the public key.
You should also make sure that your software comes with Drummond certification. The Drummond Group defines test cases that all AS2 systems taking part in a given round of testing need to be able to handle. Only software that can carry out all the required tasks error-free with every other participant will receive the Drummond Group seal of quality.
This is a highly intensive (and expensive) process, and in order to pass, the software actually needs to meet more test criteria than are set out in RFC 4130 (the specification for AS2). Just ask the software manufacturer if they can provide you with a Drummond certificate.
Pros & Cons
Pros:
- HTTP gives you instant confirmation that the transfer itself was successful (at least through the HTTP response code).
- On top of that, the MDN confirms that the data were received cleanly, and generally also confirms that there were no problems with the decryption and signature check. This can happen synchronously or asynchronously, depending on your requirements.
- The data are encrypted, which ensures that only the intended recipient can read them.
- The signature then guarantees that neither the content nor the sender can be manipulated.
- HTTPS also adds that little bit of extra security on top.
- All the security mechanisms listed above mean that the transfer can take place cheaply over the open Internet, with no need for either an ISDN connection or a VAN.
Cons:
- Good AS2 software (ideally with Drummond certification) is expensive.
- If one of the partners in the communication insists on signed certificates, that will also cost you money.
In terms of costs, you can quickly earn back the money you spent on your software and on any certificates once you start carrying out a significant number of transfers per month. With ISDN-based processes (such as OFTP) or with VANs (such as X.400), you pay extra for each additional transfer, whereas AS2 communication runs over your standard Internet flat rate. It’s up to you to figure out the exact number of transfers that you consider to be “significant”. However, a lot of businesses are currently migrating from X.400 to AS2 precisely because they can save on continuous transfer costs over the course of many years.
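As a rough aid for that estimate, the break-even point can be calculated as the one-off cost divided by the fee saved per transfer. The sketch below assumes that the per-transfer fee is constant and that running costs such as maintenance are ignored. The figures are invented purely for illustration.

```python
import math


def break_even_transfers(one_off_cost, fee_per_transfer):
    """Smallest number of transfers at which the saved per-transfer fees
    cover the one-off cost of software and certificates."""
    return math.ceil(one_off_cost / fee_per_transfer)


# Hypothetical figures: 6000 for software and certificates,
# 0.25 saved per transfer compared to a pay-per-use network.
transfers_needed = break_even_transfers(6000, 0.25)
print(transfers_needed)
```

Dividing the result by your expected number of transfers per month tells you after how many months the investment has paid for itself.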
SFTP/SCP
SFTP (SSH/Secure File Transfer Protocol) and SCP (Secure Copy Protocol) are two transfer protocols that are both based on SSH. They take advantage of the authentication and encryption possibilities offered by SSH; they typically communicate with an SSH server on the server side; and they generally also communicate directly over port 22. Although it has a similar name, SFTP is only superficially similar to the widely used FTP. Both are used for the purpose of transferring files, but on the technical level they have little in common. However, SFTP still offers more possibilities than the more widespread SCP. The latter is only able to copy files (and recursive directory trees), whereas SFTP can also issue commands to rename and delete files or set up directories (in a similar way to standard FTP).
Once again, you can find the key facts about both protocols on Wikipedia. What we are interested in here is how they can help us with EDI.
Compared to standard FTP, SFTP/SCP offers two advantages:
• The entire communication is encrypted (though this can also be achieved with FTPS – i.e. SSL-encrypted FTP).
• Only one port needs to be opened in the firewall (usually port 22).
Here’s how the ports work:
With FTP, a new port is negotiated for each individual data transfer (whether a file or just a directory listing). When unencrypted FTP is used, firewalls can listen in on the negotiation and open the correct port, but this isn’t possible with FTPS. For that reason, we need to go to the trouble of restricting the data ports to a specific port range and opening all of them in the firewall. And when all your networking is outsourced to a service provider, that can be a frustrating process.
By contrast, SFTP simply uses port 22 (or an alternative port if the SSH server is configured differently) for both commands and data transfers. This port is simply left open.
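The difference in firewall effort can be illustrated with a small sketch that lists the inbound TCP ports a server-side firewall would have to allow. It assumes explicit FTPS with the control connection on port 21 and a passive data port range chosen by the administrator. The function name and the range used in the example are hypothetical.

```python
def ports_to_open(protocol, passive_range=None):
    """Return the inbound TCP ports a server-side firewall must allow."""
    if protocol in ("sftp", "scp"):
        # Commands and data share the single SSH port.
        return [22]
    if protocol == "ftps":
        if passive_range is None:
            raise ValueError("FTPS needs a fixed passive data port range")
        first, last = passive_range
        # Control connection plus every port of the data range, because
        # the firewall cannot read the encrypted port negotiation.
        return [21] + list(range(first, last + 1))
    raise ValueError(f"unknown protocol: {protocol}")


sftp_ports = ports_to_open("sftp")
ftps_ports = ports_to_open("ftps", passive_range=(50000, 50009))
```

With these example values, SFTP needs one open port, while FTPS needs eleven.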
SCP enjoys the same advantage, except that there are a few things it can’t do that SFTP can. But if you don’t need those things (or you deliberately want to limit other parties to just copying), then it works perfectly well. On the other hand, once the other party has retrieved the data intended for them, they won’t be able to delete those data without the commands from Secure Shell itself – and permitting an external party to execute SSH commands on your own hard drive is not always a good idea.
That brings us on to the disadvantage – at least for one half of the communication. Because an SSH server is used to set up the connection and for the encryption, there is a risk (depending on the software used) that a user who should only have access to SCP or SFTP might also log into a Secure Shell and interfere with things on the server. He might have his own account and restricted permissions, but even so, users still manage to slip through the net from time to time. That can be a real headache for many administrators.
Aside from that, SFTP and SCP naturally come with the same pros and cons as standard FTP (except for the ones that we have already discussed as points of difference). Some relief can be provided by a system with its own protocol integrations. This kind of software can (for example) make an SFTP and SCP server available that doesn’t offer any of the possibilities of a secure shell. It can also admit the transferred data for processing straight away, without any of the intermediate steps set out in the article on FTP.
X.400
X.400 is a kind of email, though a very special one. The protocol has very little in common with the email format used over the Internet, and unlike with “standard” email, there aren’t many providers to choose from. For example, the only X.400 provider in Germany is Deutsche Telekom, which took over from the federal postal service. That’s why X.400 is also known as Telebox400 and BusinessMail X.400, as these are the names under which it used to be offered by the postal service and is now offered by Deutsche Telekom.
As ever, there is an informative Wikipedia article on X.400. Here, we will focus on its practical use in EDI.
As for the differences
- Not just anybody can run their own X.400 server online. This offers a degree of security, since the messages can only travel via known (and hopefully trustworthy) locations.
- Once a message is delivered into your (or somebody else’s) X.400 server, its arrival is confirmed by means of a report.
- In turn, the recipient can confirm to the sender that the message has been received (as used to be the case with regular email until spammers ruined everything).
- This means we can class the network of X.400 servers and clients as a VAN, since as well as enabling transfers, it provides a guarantee that not everybody can listen in on the communication.
- As is typical for VANs, that additional service comes at a price. A fee is payable for the mail box, and on top of that you pay for every connection and message. Even the confirmation of receipt by the recipient costs money, which is why they are only sent in rare cases.
- X.400 servers used to be accessed via ISDN, but for some time now it has also been possible to contact them over the Internet, including with SSL encryption. The ISDN variant costs more money due to the connection fees.
- You can use the FileWork software distributed by Deutsche Telekom or another alternative as your mail client. The protocol should be integrated into any EDI system, since X.400 is very commonly used in EDI.
- There is also a gateway between X.400 messages and online emails, which converts the messages in both directions so that users of standard email can communicate with X.400 customers – however, this also costs money.
X.400 or AS2
You will probably have noticed how many times the words “this costs money” appear in the description above. This is one of the biggest disadvantages of X.400. Its advantage over other transfer methods is its relative security (ISDN or SSL encryption, only trusted servers and participants, etc.), as well as the option of receiving receipt confirmations. However, most other protocols also offer security, and the cost of receipt confirmations means that many users don’t bother to send them.
If you’ve read the article on AS2, you will no doubt be thinking of MDNs – the receipt confirmations used in AS2. These form part of the standard, and their use is compulsory. On top of that, the actual data transfer in AS2 is free of charge (aside from your Internet flat rate). That’s why so many organisations are currently moving away from X.400 and towards AS2. The Wikipedia article compares these two protocols directly (latest version October 9, 2013). Among the disadvantages of AS2, it lists the effort involved in maintaining certificates and the necessity of keeping a dedicated EDI system running reliably. That said, maintaining the certificates isn’t all that challenging with the right software, and in the era of server farms with virtual machines and so on, it’s also possible to provide an EDI system reasonably reliably. Even the X.400 servers run by Deutsche Telekom are often down, or are blocked by extreme numbers of requests – and when things do go wrong, there’s nothing you can do to intervene. With AS2, by contrast, there is no real disadvantage if you can’t reach the other party’s HTTP server and have to try again later.
And as for security: if you rely on sending your data unencrypted over the X.400 network, you may be placing more trust in the server operators than is really wise. By contrast, a fully encrypted and signed AS2 message sent via HTTPS will take at least quite a bit of effort to hack into. On the other hand, as we have already seen, some “services” are fairly deeply rooted in the servers of various different IT companies.
Whether you decide to use X.400 or AS2 (or something else altogether) will naturally depend on your data volumes, as well as what your communication partners are able to support. If X.400 meets your needs well, you may hesitate to switch to AS2 simply as a means of reducing costs, and vice versa.
You might even need both of them. As such, you should make sure that your EDI software can handle both options (along with as many other protocols as possible) before you buy it.
OFTP
On a technical level, the ODETTE File Transfer Protocol (OFTP) has very little in common with the widely known FTP. It is used for transferring files, and that’s about it. One of its advantages around the time it was developed was that it could resume interrupted transfers. This is impossible with X.400 (for example) – and because OFTP is often used to exchange very large files (such as CAD data), it wastes an awful lot of time – and sometimes money too – to have to restart a transfer from the very beginning when it fails just before completion. OFTP also features confirmations of receipt, similarly to X.400 or AS2.
Wikipedia offers only a very brief account of this protocol, so we have fleshed things out a little below.
Rough overview
OFTP dates back to the very earliest days of the Internet, when ISDN was just about the most modern feature that was generally available. As such, OFTP was run over ISDN lines, and that is still largely the case today – although the protocol itself is actually independent of the transport layer, and the old OFTP version 1 can also be used via TCP/IP, and thus over the Internet. However, the ISDN route is the standard one, which makes sense, since a direct connection is harder to eavesdrop on than a transfer over the open Internet. OFTP1 still lacks any encryption specifications of its own, unlike the newer version 2, where the use of TLS (or SSL) forms part of the standard. That’s why OFTP2 is often run over the Internet.
As we have already mentioned, OFTP features a receipt confirmation called an end-to-end response (or EERP for short). This confirms to the sender that the file has actually arrived with the recipient in a similar way to AS2 or X.400. As with AS2’s MDN, the EERP in OFTP can either be delivered straight away as part of the current transfer, or over a separate connection later on. Unlike X.400, however, it is generally actually used in practice, since there is no extra cost involved if the communication is configured in a sensible way.
The use of direct ISDN connections between partners in OFTP1 or SSL encryption in OFTP2 makes the transfer secure, while the EERP also allows users to trace whether their file has reached its recipient.
However, the biggest shortcoming of OFTP1 is that it takes place almost exclusively over ISDN. For one thing, this means that each new connection costs money (which can quickly add up to several thousand euros per month in some companies), and for another, it means you need enough free ISDN channels to ensure you don’t constantly hear an engaged tone whenever you want to transfer any data. That in turn requires the right hardware. All this explains why OFTP1 is currently being gradually taken out of service. In particular, the major car manufacturers (OFTP originates from the automotive sector) are currently replacing their old, ISDN-based OFTP1 with OFTP2. In principle, they could just as easily switch to AS2, which offers a little more security (see the AS2 article) – but sometimes people prefer to stick to what they know. Besides, they might need to introduce a new version of their OFTP software (or another software package altogether) that is still compatible with OFTP1, since not every partner will switch to OFTP2 at the same time. If they opt for something entirely new instead (in order to switch to AS2, for example) then they might not be able to handle OFTP at all anymore. Of course, there are EDI systems out there that can handle both alternatives, and much more besides.
So if you expect that you will need to use OFTP, you should definitely look for a product
A little more detail
In theory, OFTP is strictly a send protocol. You open a connection to your partner, identify yourself with your user name (in this case an Odette ID issued by Odette) and password, check the user name and password of your counterpart, and send data across to the other side. You can’t call up files (as with a GET in FTP). However, what you can do is reverse the direction of the data exchange. After depositing your files, you simply send the “change direction” command, which sends a message along the lines of: “I’m done. If you have something for me then send it over”. This mechanism allows the EERPs of previously transferred files to be delivered to the sender, and it also enables two-way communication.
The recipient can now send over their own files. This role reversal can theoretically be repeated as often as needed, until one side is assigned the right to send data but has nothing left to send. When that happens, the party in question simply closes the connection.
In the introductory article on communication paths, we have already mentioned that OFTP can’t actually be used to retrieve data from elsewhere, unlike FTP. However, you can probably now guess how a request of that kind might be achieved: You simply establish a connection and submit a “change direction” command, on the assumption that there might be some files for you. If the other party has something to send, they can now do so.
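The role reversal described above can be sketched as a small simulation. It is not an OFTP implementation: the function name, the party labels and the file names are hypothetical, and the real protocol commands are reduced to readable log entries.

```python
from collections import deque


def run_session(initiator_files, responder_files):
    """Simulate the turn-taking of one session and return the event log."""
    queues = {
        "initiator": deque(initiator_files),
        "responder": deque(responder_files),
    }
    other = {"initiator": "responder", "responder": "initiator"}
    speaker = "initiator"
    first_turn = True
    log = []
    while True:
        if not queues[speaker] and not first_turn:
            # Holding the right to send with nothing left to send: close.
            log.append(f"{speaker}: close connection")
            return log
        while queues[speaker]:
            log.append(f"{speaker}: send {queues[speaker].popleft()}")
        # Hand the right to send over to the other side.
        log.append(f"{speaker}: change direction")
        speaker = other[speaker]
        first_turn = False


# "Polling": the initiator has nothing to send and only asks for files.
polling_log = run_session([], ["invoice.edi"])
```

In the polling case the log reads: initiator changes direction, responder sends invoice.edi, responder changes direction, initiator closes the connection.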
Another interesting point is that a data transfer of this kind can even involve more than two Odette IDs. This is because OFTP doesn’t simply distinguish between senders and recipients; rather, OFTP looks at who is currently exchanging data (the session IDs) and from whom and to whom the data should be going (the file IDs). In other words:
Participant ABC opens an OFTP connection with participant XYZ. The two parties swap IDs and passwords and establish who is talking to whom. ABC now sends a file, but specifies that they are doing so on behalf of DEF, and that the file is actually intended for recipient UVW. Then comes the next file, which might be from GHI and intended for RST.
Confusing, right? But all this makes perfect sense, since it allows company head offices or service providers to send data via OFTP on behalf of their branches or customers in such a way that everyone knows who the actual sender and recipient are.
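The distinction between session IDs and file IDs can be made concrete with two small data structures. This is only a sketch of the idea: the class and field names and the file names are hypothetical and are not taken from the OFTP specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    initiator_id: str  # who opened the connection
    responder_id: str  # who accepted it


@dataclass(frozen=True)
class FileTransfer:
    originator_id: str   # on whose behalf the file is sent
    destination_id: str  # who the file is ultimately intended for
    name: str


def is_relayed(session, transfer):
    """True if the file belongs to parties other than the two session partners."""
    session_parties = {session.initiator_id, session.responder_id}
    file_parties = {transfer.originator_id, transfer.destination_id}
    return file_parties != session_parties


session = Session("ABC", "XYZ")
transfers = [
    FileTransfer("DEF", "UVW", "orders.edi"),
    FileTransfer("GHI", "RST", "forecast.edi"),
]
relayed = [is_relayed(session, t) for t in transfers]
```

Both example files are relayed on behalf of third parties, even though only ABC and XYZ are connected.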
Without getting too bogged down in the details of the protocol level, we should quickly mention one special feature of OFTP here:
Virtual files
Each file to be sent is packaged in something called a virtual file, which contains some additional information. The file name is made up of:
- The original file name.
- The date and time (accurate to the second) of the transfer.
- A counter for finer resolution.
The virtual file also contains one other piece of important information about the contents of the (data) file – it specifies whether one of the following structure types is being used:
- F = Fixed Length: Each record has a precisely defined uniform length (which is also specified here). That is equivalent to the basic fixed record format, as described in the chapter on that topic.
- V = Variable Length: The records within the file have different lengths. This is also covered in the section referred to above.
- U = Unstructured: Data of any kind can be transported here, including binary data (as with ENGDAT).
- T = Text: The data consist entirely of ASCII characters. There are no control characters aside from line breaks (CRLF), and each line is no more than 2048 characters long.
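The rules for formats T and F lend themselves to a simple check before sending. The sketch below takes the rules exactly as listed above and assumes that the 2048-character limit excludes the CRLF itself. The function names are hypothetical.

```python
def is_valid_text_format(data: bytes, max_line_length: int = 2048) -> bool:
    """Check format T: printable ASCII only, CRLF line breaks, bounded lines."""
    for line in data.split(b"\r\n"):
        if len(line) > max_line_length:
            return False
        # Anything outside printable ASCII (including a lone CR or LF)
        # counts as a violation.
        if any(byte < 0x20 or byte > 0x7E for byte in line):
            return False
    return True


def is_valid_fixed_format(data: bytes, record_length: int) -> bool:
    """Check format F: the data must split into records of equal length."""
    return record_length > 0 and len(data) % record_length == 0
```

A file that fails both checks can still be sent as format U, since unstructured files may contain data of any kind.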
OFTP in practice
As we have already said, OFTP is particularly widespread in the automotive industry; however, the protocol is used outside that sector too. If you have to exchange data with business partners who insist on OFTP, you won’t be able to avoid using it yourself.
But you won’t incur any significant running costs if you can use the newer OFTP2 instead of running OFTP1 via ISDN. You will need a software package that can handle OFTP2, an Internet connection, an Odette ID and a certificate. The last of these must be issued by Odette itself, or alternatively signed by Odette or by a trust centre approved by the institute. These four items cost money, and you will also occasionally have to pay a small fee to extend the certificate, but these charges are negligible compared to the ISDN costs involved in the old method – provided that you intend to make a significant number of transfers.
If you are forced to make do with the old OFTP1, you will need suitable ISDN hardware. Depending on the field of application, that might be an ISDN card in your computer, or a dedicated device that is accessed over the network via TCP/IP (but still communicates with the outside world over ISDN). Before you buy anything, you should ask the manufacturer of your preferred OFTP/EDI software for recommendations.
But whatever you do, don’t buy any software that can only handle OFTP1 just because it’s cheaper!
As we have said, OFTP1 is in the process of becoming obsolete, so you will definitely be buying a discontinued model and will have to get your wallet out again later on when your communication partners switch to version 2.
By the way, if you come up against the ENGDAT procedure then you will definitely need to be ready for OFTP too. Officially, the two are completely separate – yet OFTP has become the default protocol for data transfers within the ENGDAT process. However, both OFTP versions are used in ENGDAT, so you still have a free choice there.
WebDAV
WebDAV (Web-based Distributed Authoring and Versioning) is an extension of HTTP that takes it back to its roots, in a certain sense. Tim Berners-Lee’s original idea wasn’t for the Internet to be a medium for selling shoes, ringtones and other trinkets or for watching films; rather, he intended it for serious work. Websites shouldn’t be made available to the world for the purposes of passive consumption; instead, they should be actively edited on a joint and cooperative basis. We still find something of that spirit in the form of wikis.
When multiple people jointly edit a file online, they still need the basic options of downloading the file from the server and then re-uploading it in order to save it; however, they also need mechanisms to block other users from accessing the file during editing (to avoid different users overwriting each other’s changes), and ideally version control too, in order to restore previous versions. All this is offered by WebDAV, along with a few other operations for moving, copying and deleting files and working with directories.
WebDAV and EDI
All these nice editing and versioning options are of little relevance to EDI. We are only interested in how our EDI system receives data and sends them back out again. In other words, it’s all about transporting data. With this in mind, when we look at the methods offered by WebDAV (on top of those offered by HTTP), we soon establish that the ones that are relevant to EDI are actually already available in the FTP protocol. What’s more, FTP even offers a few more options. Indeed, even HTTP itself provides us with everything we need in the form of GET and POST.
So what is the benefit of WebDAV? Well, its advantage over FTP is that HTTP can get through almost any firewall, and even when using SSL encryption, we avoid the typical problems we encounter when using FTPS (see the chapter on FTP). And compared to simple HTTP, it has the advantage that we can work with (and modify) a directory structure in a similar way to FTP. Otherwise, it should make little difference whether you dock a WebDAV server to your HTTP server (there are plenty of options here) or a simple CGI script that receives data via POST (downloads via GET should work regardless). Oh yeah, and user rights are another bonus – but FTP has those too.
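To make the comparison with FTP more tangible, the sketch below maps the FTP commands that matter for pure data transport to their closest HTTP or WebDAV counterparts. GET, PUT and DELETE are plain HTTP methods, while PROPFIND, MKCOL and MOVE come from the WebDAV extension. The dictionary, the function name and the URL are hypothetical, and the mapping is a simplification rather than an exact equivalence.

```python
# FTP command -> (HTTP/WebDAV method, fixed request headers)
FTP_TO_WEBDAV = {
    "LIST": ("PROPFIND", {"Depth": "1"}),  # list a directory (collection)
    "RETR": ("GET", {}),                   # download a file
    "STOR": ("PUT", {}),                   # upload a file
    "DELE": ("DELETE", {}),                # delete a file
    "MKD": ("MKCOL", {}),                  # create a directory
    "RNFR/RNTO": ("MOVE", {}),             # rename or move a file
}


def webdav_equivalent(ftp_command, new_url=None):
    """Return (method, headers) for the given FTP command."""
    method, fixed_headers = FTP_TO_WEBDAV[ftp_command]
    headers = dict(fixed_headers)
    if method == "MOVE":
        if new_url is None:
            raise ValueError("MOVE needs the target URL")
        # WebDAV passes the rename target in the Destination header.
        headers["Destination"] = new_url
    return method, headers


listing = webdav_equivalent("LIST")
rename = webdav_equivalent("RNFR/RNTO", new_url="https://edi.example.com/in/done.edi")
```

Every entry travels over the same HTTP or HTTPS port, which is the firewall advantage described above.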
In the end, the use of WebDAV for EDI only makes sense if one of your partners provides you with data in this way. For that reason, your EDI software should be able to handle the WebDAV protocol too, just in case.