GSoC 2009 ODT/SXW Import plug-in for Scribus by atmb4u

=Abstract=

The OpenDocument Text (ODT) format, the default in OpenOffice.org 2 and now part of an ISO standard supported by Google Docs, is the most widely used open word-processing file format; SXW is its predecessor from OpenOffice.org 1.x. This project aims to implement complete import of ODT/SXW files (i.e. to import them as documents, not just as formatted text) by means of a Scribus plug-in. Export will be taken into consideration while designing the import.

=Motivation=

Desktop publishing software clearly needs to import word-processing formats, and the need is most pressing for the most widely used format. ODT and SXW are popular open word-processing formats, so it is essential that Scribus can import them. Scribus makes desktop publishing with free and open software possible, and importing ODT/SXW files directly into the workspace would make it work better. As with all DTP software, Scribus is rarely used to write text; it is used for layout. Therefore, as I found during the time I spent using Scribus, it is necessary to import documents from a word processor. This will have wide-reaching usability implications for Scribus users in general. I chose to cover SXW along with ODT because the differences between the two file formats are minor. Moreover, I would appreciate the opportunity to work with an open source project like Scribus and the open source community during this summer.

=Objectives=

The project's objectives are to produce:


 * ODT/SXW import and, if time permits, ODT export for Scribus, implemented as a plug-in.


 * Documentation for the plug-in.

=Challenges=

The major challenges in implementing this project are:


 * Reading ODT/SXW files.
 * Extracting graphics, tables and formulas and, if necessary, converting them to a format that Scribus can use (e.g. SVG).
 * Importing basic and common items such as text, graphics, page setup, margins, page numbering, static headers and footers, paragraph and character styles and document metadata, as far as Scribus supports them.
 * Opening an ODT/SXW file in the workspace by converting it to a Scribus document.

=The Plan=


 * Decode the ODT/SXW file: unpack the compressed archive into temporary storage and read the XML it contains.
 * Read the JPEG, SVG and text content of the ODT/SXW file.
 * Read the font styles, page layout and margins from the source file and import them into the Scribus document.
 * Convert the entire ODT/SXW file into a Scribus document.
 * Position the objects correctly.
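
The first two steps of the plan can be sketched roughly as follows. This is only an illustration in Python, not the plug-in itself, which would have to be written against the Scribus plug-in API. It assumes the usual ODT layout: a ZIP archive whose main text lives in `content.xml` and whose embedded images are stored under `Pictures/`. The function names and the sample document are hypothetical, and the sketch reads the archive in memory instead of unpacking it to temporary storage. SXW uses different XML namespace URIs, so this sketch covers ODT only.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespace for text elements in content.xml.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical minimal content.xml, used only to exercise the sketch.
SAMPLE_CONTENT = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<office:document-content'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<office:body><office:text>'
    '<text:h text:outline-level="1">Title</text:h>'
    '<text:p>Hello <text:span>Scribus</text:span>!</text:p>'
    '</office:text></office:body>'
    '</office:document-content>'
)


def make_sample_odt():
    """Build a tiny ODT-like archive in memory (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", SAMPLE_CONTENT)
        archive.writestr("Pictures/logo.svg",
                         "<svg xmlns='http://www.w3.org/2000/svg'/>")
    buffer.seek(0)
    return buffer


def read_odt_paragraphs(source):
    """Return the plain text of every heading and paragraph, in document order."""
    with zipfile.ZipFile(source) as archive:
        content = archive.read("content.xml")
    root = ET.fromstring(content)
    wanted = ("{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS)
    # itertext() also collects text nested in child elements such as text:span.
    return ["".join(el.itertext()) for el in root.iter() if el.tag in wanted]


def list_embedded_images(source):
    """List archive members under Pictures/ (assumed image location)."""
    with zipfile.ZipFile(source) as archive:
        return [n for n in archive.namelist() if n.startswith("Pictures/")]


if __name__ == "__main__":
    odt = make_sample_odt()
    print(read_odt_paragraphs(odt))   # ['Title', 'Hello Scribus!']
    print(list_embedded_images(odt))  # ['Pictures/logo.svg']
```

The sketch flattens each paragraph to plain text. A real importer would instead walk the same elements and map their style attributes to Scribus paragraph and character styles.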

=Timeline=

April 28-May 08: Understand the XML structure of ODT and SXW files. Read the Scribus documentation thoroughly, especially the parts on programming Scribus. Communicate with the mentors and clear up basic questions about the project.

May 08-May 17: Design the structure of the plug-in, define the scope of the import and the minimum support required beyond what is proposed, and discuss the design on IRC and the mailing list.

May 18-May 20: Open ODT/SXW files from Scribus using the newly created plug-in. Extract the ODT archive to a temporary folder for further operations.

May 21-May 22: Testing, Phase I.

May 23-June 28: Import the different components from the ODT/SXW file.

June 29-July 10: Import the different styles, such as font styles and paragraph styles, from the ODT/SXW file.

July 11-July 24: Convert the entire file into a Scribus document.

July 25-July 30: Testing and debugging, Phase II.

July 31-August 11: Complete the coding and sync the entire project.

August 12-August 17: Complete the documentation and clean up the code.

Note: The timeline does not list the daily interaction with the mentor.

=Project Outcomes=

The expected outcomes of this project are as follows:


 * ODT/SXW files can be imported into Scribus with their basic content. The minimum guarantee covers basic and common items: text, graphics, page setup, margins, page numbering, and paragraph and character styles. Support will be expanded as time allows. (This is not a complete ODT/SXW importer, which would be too large to fit in the limited time frame of GSoC 2009.)


 * Proper documentation acceptable to the Scribus maintainers accompanies the code.

=About Me=

Name: Anoop Thomas Mathew

University: Cochin University of Science and Technology (CUSAT), Kerala, India

IRC Nick: atmb4u

Email ID: atmb4u@gmail.com

Location: Kerala, India

Timezone: GMT+5:30

I am Anoop Thomas Mathew, currently studying for a B.Tech degree in Computer Science and Engineering at the College of Engineering, Chengannur, Kerala, India.

I have been using GNU/Linux for the past 5 years, starting off with RHEL 4.

I have good knowledge of C, C++, Java and Python, and I have been coding in them for the past 5 years. I started learning programming in 3rd grade with LOGO, followed by BASIC (5th), C (6th), C++ (9th) and Python (11th).

I have been a member of the FSF for 3 years and have fixed a few bugs in the Vim text editor. I have been on the Vim development mailing list for more than a year. I don’t have any previous experience with Scribus, but I hope to work with Scribus through GSoC 2009.

=Eligibility=

I can provide a certificate from my college confirming my student status until 2010. On request, I can also provide a copy of my passport as proof of my age.