Article ID: iaor20123403
Volume: 4
Issue: 2
Start Page Number: 137
End Page Number: 156
Publication Date: Mar 2012
Journal: International Journal of Shipping and Transport Logistics
Authors: Wang Jiahui, Pulat P Simin, Shen Guoqiang
Keywords: data mining
Integrating data from multiple transportation data sources has been an ongoing challenge for researchers who study the movement of freight across the globe, because the sources use different units, scales, data frequencies and commodity codes. This paper proposes a three-step data mining model to select and integrate data sources for freight transportation applications, with a special focus on global port-to-port freight movement between the USA and other countries. The data filtration step selects the relevant data sources from a set of original data sources and then identifies the most efficient subset of them. The data integration step applies specific integration techniques to build a new database for a given freight transportation study. The data interaction step investigates how the newly built database can be used in a variety of application domains. The approach is demonstrated by building a database for studying global containerised freight movement for the USA.
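
The integration problem described above (different units, data frequencies and commodity codes) can be illustrated with a minimal Python sketch. It does not reproduce the paper's model: the source names, record layout, code crosswalk and sample figures are all hypothetical and chosen only to show how records might be brought to a common unit, a common annual frequency and a shared commodity code before they are combined.

```python
from collections import defaultdict

# Conversion factors to a common unit (metric tonnes).
TO_TONNES = {"tonne": 1.0, "short_ton": 0.90718474, "kg": 0.001}

# Hypothetical crosswalk from source-specific commodity codes to a shared label.
CODE_MAP = {
    ("source_a", "0901"): "coffee",
    ("source_b", "C-17"): "coffee",
}


def normalise(source, record):
    """Map one raw record to (year, shared commodity label, tonnes)."""
    year = int(record["period"][:4])  # "2010-03" and "2010" both reduce to 2010
    commodity = CODE_MAP[(source, record["code"])]
    tonnes = record["weight"] * TO_TONNES[record["unit"]]
    return year, commodity, tonnes


def integrate(sources):
    """Aggregate every source to annual tonnes per commodity."""
    totals = defaultdict(float)
    for source, records in sources.items():
        for record in records:
            year, commodity, tonnes = normalise(source, record)
            totals[(year, commodity, source)] += tonnes
    return dict(totals)


# Hypothetical sample data: one monthly source in kilograms,
# one annual source in short tons.
sources = {
    "source_a": [
        {"period": "2010-01", "code": "0901", "weight": 2000.0, "unit": "kg"},
        {"period": "2010-02", "code": "0901", "weight": 3000.0, "unit": "kg"},
    ],
    "source_b": [
        {"period": "2010", "code": "C-17", "weight": 10.0, "unit": "short_ton"},
    ],
}

integrated = integrate(sources)
for key in sorted(integrated):
    print(key, round(integrated[key], 3))
```

Keeping the source name in the aggregation key leaves the two sources side by side, so their annual totals can be compared before any decision is made about which source to retain.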