GSoC 2009 Accepted Applications

In no particular order:

=Search and Replace=

Rationale

 * Text is a substantial part of many Scribus documents. 'Search and Replace' is one of the most fundamental and widely used text manipulation operations. The current implementation lacks many features while having a non-modular UI.
 * Reimplementing it with new features will increase both the usability and popularity of Scribus.

Motivation

 * Scribus is the leading Open Source Desktop Publishing software. However, commercial programs appear to have a better implementation of features such as 'Search and Replace'. Because it is so widely used, this feature has to be improved. I have a lot of programming experience in C. Writing a feature from scratch for major software like Scribus is my 'Developer's Dream'. I want to fulfill it and also help to improve Scribus. Interacting with the Scribus team has been an absolute pleasure. Plus, the 'Scribus Culture' has impressed me so much that I am no longer trying to apply to any other GSoC organizations.

Overview

 * After discussing my plans with the Scribus developers, the following goals are to be met by the time of completion of the project:


 * 1) Rewrite the source code to make it more modular starting with separation of GUI from the search logic.
 * 2) Implement support for regular expressions in the search using Qt's QRegExp class.
 * 3) Implement simultaneous search and replace in multiple documents.
 * 4) Improve user interface by separating Basic Search and Advanced Search.

Essential

 * The following features have to be implemented by the end of GSoC 2009:


 * 1) The new 'Search and Replace' will have two 'Search and Replace' dialogs. The first one will search in a particular text region. The second will let the user search the complete document or selected parts of it. It will first categorize the entire Scribus document to enumerate its individual parts and then invoke the first 'Search and Replace' on all or only selected parts.
 * 2) The new 'Search and Replace' dialog will have two modes. One will perform basic 'Search and Replace' using text-based matching. The second mode will be the full-fledged dialog similar to what we have today. The second one can be invoked from the first by clicking on the 'Advanced Search' button or can be invoked directly.
 * 3) The new 'Search and Replace' will support Regular Expressions. That means we can, for instance, find all words beginning with or containing a particular string, or a character at a particular location.
 * 4) The code will be divided into two components - the GUI and the actual code logic. This would allow 'Search and Replace' to be extended much more easily later on. As a feature might require an addition to the UI part and a separate code backend, changes in one can be done independently of the other.
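
The split between search logic and GUI described above can be sketched without any toolkit code. The following is only an illustration in Python, not Scribus code: the names `Match`, `compile_query`, `find_all`, `replace_all` and `search_regions` are hypothetical, and Python's `re` module stands in for the regular expression engine the project would use. Basic mode escapes the query so it matches literally; the document-wide search simply calls the per-region search on all or selected regions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    start: int  # index of the first matched character
    end: int    # index one past the last matched character


def compile_query(query, use_regex=False, case_sensitive=True):
    """Compile a user query; in basic mode the query is matched literally."""
    source = query if use_regex else re.escape(query)
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(source, flags)


def find_all(text, query, use_regex=False, case_sensitive=True):
    """Return every non-overlapping match of the query in one text region."""
    pattern = compile_query(query, use_regex, case_sensitive)
    return [Match(m.start(), m.end()) for m in pattern.finditer(text)]


def replace_all(text, query, replacement, use_regex=False, case_sensitive=True):
    """Return (new_text, number_of_replacements)."""
    pattern = compile_query(query, use_regex, case_sensitive)
    if use_regex:
        # Regex mode: the replacement may use group references such as \1.
        return pattern.subn(replacement, text)
    # Basic mode: a callable keeps backslashes in the replacement literal.
    return pattern.subn(lambda _match: replacement, text)


def search_regions(regions, query, selected=None, **options):
    """Run the per-region search over all regions, or only the selected ones.

    `regions` maps a region name (for example a text frame) to its text.
    """
    names = regions.keys() if selected is None else selected
    return {name: find_all(regions[name], query, **options) for name in names}
```

A dialog would only collect the query and options and display the returned matches, so either side could change without touching the other.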

Optional

 * If any time is left, I would also like to implement the following ideas:


 * 1) To make the 'Search and Replace' dialog more informative, proper message boxes would be provided such as 'Match not Found' unlike the current austere 'Search Complete' message.
 * 2) Depending on the 'textbox' support by Qt, I could add an incremental search capability. This feature would need a mechanism in text frames to send the characters as soon as they are being typed rather than only after the 'Return' key is pressed.
 * 3) To have 'Search and Replace' automatically take input from the text highlighted by user when it is invoked (From the bugtracker).
 * 4) To have a 'Wrap Around' feature. If a user enables it, then when the end of the document is reached, the search starts from the beginning again. This would be needed only for the search component.
 * 5) To shift the 'Search and replace' dialog aside when a user needs to view the documents being searched.
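
The 'Wrap Around' idea can be sketched as a small function. This is an illustration only: `find_next` is a hypothetical name, and a plain string stands in for the document text. When nothing is found between the start position and the end, the search resumes from the beginning and only considers matches that start before the original position, so it cannot loop forever.

```python
def find_next(text, needle, start, wrap_around=False):
    """Return the index of the next match at or after `start`, or -1."""
    if not needle:
        return -1
    index = text.find(needle, start)
    if index == -1 and wrap_around and start > 0:
        # Second pass from the top; the end bound admits only matches
        # that begin before `start`, which the first pass did not cover.
        index = text.find(needle, 0, start + len(needle) - 1)
    return index
```

A result of -1 is the case where a 'Match not Found' message would be shown.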

Scribus Users Feedback
Please don't forget the 'style' searches:


 * searching occurrences of 'needle' with bold characters
 * searching occurrences of 'needle' with font size = 10
 * searching occurrences of 'needle' with Title1 style
 * searching occurrences of 'needle' with not-bold characters
 * searching occurrences of 'needle' with characters (either bold or not-bold) AND italic
 * etc... with all possible mixes, including negative criteria and unknown values (three-state buttons)
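
One way to model the three-state criteria requested above is to let each criterion be either a required value or "don't care". The sketch below is hypothetical throughout (`Run`, `StyleCriteria` and `find_styled` are made-up names, and a list of uniformly styled text runs stands in for a real text frame). For simplicity it does not find a needle that spans two runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    """A stretch of text with uniform formatting (illustrative model)."""
    text: str
    bold: bool
    italic: bool
    font_size: float
    style: str


@dataclass(frozen=True)
class StyleCriteria:
    """Each field is a required value, or None for 'don't care'."""
    bold: Optional[bool] = None
    italic: Optional[bool] = None
    font_size: Optional[float] = None
    style: Optional[str] = None

    def matches(self, run):
        for name in ("bold", "italic", "font_size", "style"):
            wanted = getattr(self, name)
            if wanted is not None and getattr(run, name) != wanted:
                return False
        return True


def find_styled(runs, needle, criteria):
    """Return (run_index, offset) for each needle inside a matching run."""
    hits = []
    if not needle:
        return hits
    for index, run in enumerate(runs):
        if not criteria.matches(run):
            continue
        offset = run.text.find(needle)
        while offset != -1:
            hits.append((index, offset))
            offset = run.text.find(needle, offset + len(needle))
    return hits
```

With this model, `StyleCriteria(bold=False)` expresses "not bold", while `StyleCriteria(italic=True)` leaves bold unspecified, which covers the "either bold or not-bold AND italic" case.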

Timeline

 * May 23   – June 10: Basic 'Search and Replace' without Regular Expressions
 * June 10  – June 23: Basic 'Search and Replace' with Regular Expressions
 * June 23  - July 6: Advanced 'Search and Replace'
 * July 6   – July 21: Global 'Search and Replace'
 * July 21  - August 10: Finding bugs and fixing them; implementation of additional ideas if time permits.
 * August 10 - August 17: Documenting the source code; writing about other ideas that can be implemented.

Deliverables

 * 1) A new "Search and Replace" UI with all the features mentioned under the "Essential" portion of "Implementation".
 * 2) Project documentation.
 * 3) Some or all features from the "Optional" portion of "Implementation".

BIO
Meetanshu Gupta : meetanshu.gupta@gmail.com
 * I am a final-year undergraduate student in Computer Engineering at the Malaviya National Institute of Technology, Jaipur, India. I have been programming in C since Class 6 and C++ since Class 11. I have worked with Qt2 to create a UI for the C++ project "Department Management Software." I have done many projects in C and C++ for academics, competitions and at college level. My interests include programming (C/C++/Java/C#), algorithms and network programming. Last summer I had a two-month internship at Microsoft IDC, Hyderabad, India and got some hands-on experience with real-world projects. Since then I have taken an interest in reading and understanding Open Source codebases (Scribus, Vim, OpenOffice.org), learning debugging techniques and working amicably in teams.

=Better Masterpages=

ABSTRACT
 * Scribus is the only professional-grade Open Source Desktop Publishing Software that supports many professional publishing features. Master pages are an important part of DTP. Presently, editing of master pages is not very flexible. It's also not possible to edit the master page features on a normal page, which is a major drawback for many users. This project aims to incorporate new functionality into Scribus that will enhance the editing of master pages.

MOTIVATION
 * Scribus makes desktop publishing with free and open software possible. Master pages are an important professional requirement. Users may want to change some features of the master page in the current normal page without changing the actual master page or creating a new one as is currently required. The current way to edit master pages is cumbersome and time consuming. The completion of this project will enable users to handle master pages and related operations with ease and flexibility, which will be widely appreciated in the desktop publishing world.

PROJECT
 * The project aims at incorporating the following features into Scribus

1. Options for editing master page object occurrences on a regular page
 * This task provides options for the user to
 * - move or resize occurrences
 * - change the style of occurrences
 * - remove occurrences from a regular page

without modifying the master page itself. Also, it allows a user to revert occurrences to how they look on the master page with options to retain content, style, size and position if necessary. An alternative approach might be to allow direct access to master page items via modifier keys.
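
The first feature can be pictured as a per-page override layer on top of the unchanged master item. The sketch below is an assumption about one possible data model, not the Scribus implementation; `ItemProps` and `Occurrence` are hypothetical names. Reverting drops the overrides, optionally keeping some of them, as described above.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ItemProps:
    """Properties of a master page item (illustrative subset)."""
    x: float
    y: float
    width: float
    height: float
    style: str


@dataclass
class Occurrence:
    """One master page item as it appears on one regular page."""
    master: ItemProps
    overrides: dict = field(default_factory=dict)
    removed: bool = False  # True if the item is hidden on this page

    def effective(self):
        """Master properties with this page's overrides applied."""
        return replace(self.master, **self.overrides)

    def revert(self, keep=()):
        """Return to the master's look, keeping the named overrides."""
        self.overrides = {k: v for k, v in self.overrides.items() if k in keep}
        self.removed = False
```

Because the master item is never written to, moving, restyling or removing an occurrence on one page leaves every other page untouched.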

2. Quick Access to Master pages


 * A quick access option 'Master Page' to be provided in the status bar which would provide access to the master page of the currently selected page. All the master page related operations may be made available here.

3. Master page Listing


 * Listing the available master pages under the Page list in the status bar, so that users could easily select the master page of their choice, edit it if needed and apply it to the current page.

4. Preview of a Master page


 * Adding a "preview" option to the "Apply Master Page" dialog box, which appears on bringing up Page -> Apply Master Page, which would provide a thumbnail preview of the selected master page or of the master page that one tries to apply to the current page.

DELIVERABLES


 * 1) Options for efficient editing of items on a master page of the current page.
 * 2) Master page Listing and Preview for better master page selection.
 * 3) Quick access option 'Master Page' to list all master page related operations under the Page list in the status bar.
 * 4) Documentation

TIMELINE


 * Up to May 23, 2009      -  Getting familiar with Scribus source code and getting clarifications and refinements for the proposed ideas.
 * May 23  - June 10, 2009 -  Implementing the first part of the project.
 * June 10 - June 13, 2009 -  Discussions with mentor about the first part and fixing bugs.
 * June 13 - July 1, 2009 -  Implementing second and third parts of project.
 * July 1 - July 3, 2009 -  Discussions with mentor about the latter two parts and fixing bugs.
 * July 3 - Aug 1, 2009   -  Implementing the last part of the project.
 * Aug 1   - Aug 3, 2009   -  Discussions with mentor and fixing bugs.
 * Aug 4   - Aug 14, 2009  -  Final review of code and documentation.

ABOUT ME


 * First Name : Dhanashree
 * Name       : Nellayi Prasad Dhanashree
 * Email Id   : dhan0110@gmail.com
 * IRC nick   : dhan0
 * Location   : Kerala, India
 * Timezone   : GMT + 5:30
 * blog URL   : csianthoughts.wordpress.com


 * I am Dhanashree N P, a second year (fourth semester) student of B.Tech Computer Science and Engineering at the Government Engineering College, Thrissur. I have been coding in C++ for the last 4 years and C for the last 2 years. I know the basics of Python and SQL, and I have started learning Qt. I have also started reading the source code of Scribus.
 * I have done coding in C++ as a part of academic projects for my school. I am a member of Free Software Users Group, Thrissur, and have already used Scribus for designing a 3-fold brochure for a seminar conducted by the group. I haven't been involved in any Open Source development projects so far, but I am looking forward to doing that as a student of GSoC 2009 with Scribus.

=Improving PDF Export=

Abstract
Over the course of developing Scribus, a lot of work has been done to make Scribus's output formats satisfy a wide range of user needs. As PDF is the de facto standard for outputting electronic documents, making Scribus's PDF output more flexible in terms of users' requirements is highly desirable. This project aims to address a few important improvements that could be applied to Scribus's PDF exporting features. The main focus will be supporting PDF/X-1a and PDF/X-4 export and improving Scribus's PDF embedding feature. Depending on the actual progress of this project, it could extend to cover the embedding of font subsets when exporting PDFs.

About PDF/X
Scribus is currently able to verify and produce PDF/X-3 documents. PDF/X-3 is one particular standard of PDF/X, a subset of PDF which concentrates on setting a standard (or rather an agreement) for exchanging documents between creators and commercial printers, so that the creators can be assured (to some extent) of the fidelity of the final printed document. Basically, PDF/X requires that
 * all fonts are embedded.
 * only printing content is presented and any extra live content like security, forms, annotations, bookmarks, sound, video, etc. is prohibited.
 * printing conditions must be present (output intent).

PDF/X-3
PDF/X-3 is an ISO standard that allows an ICC color profile to be attached to a document to explicitly manage its color, as opposed to leaving the color device-dependent. This feature provides extra assurance for users in terms of the fidelity of the color representation in the final document. See for more information on how Scribus supports this feature.

PDF/X-1a
This particular standard of PDF/X is a strict subset of PDF/X-3, where the main restriction is that the color data has to be converted to CMYK before exchange is performed. This works best if the CMYK color profile to be converted into is clearly defined (this is usually the case in the US). It is also worth noting that a PDF/X-1a compliant file is also PDF/X-3 compliant, since the latter standard clearly supports this idea.

PDF/X-4
This, on the other hand, is a superset of PDF/X-3, based on PDF 1.6, in which the main extension is allowing transparency and layers. The main reason for the restriction on transparency in PDF/X-3 is that, over the years, most printers were PostScript printers with their RIPs (Raster Image Processors) working with PostScript representations of the documents. Since PostScript was developed before the notion of transparency was introduced, it naturally could not handle transparent content (hence the restriction in PDF/X-3). Lately, high-end printers have become capable of working directly with PDFs at the RIP level and can therefore handle transparency and layers in PDF files. PDF/X-4 was consequently created in order to take advantage of this new technology.

PDF Embedding
At the moment, users can select PDF files to be added into the Scribus document's content (via "Insert Image Frame" and choosing the PDFs). At PDF export time, users have the option of "truly" embedding these PDF contents into the output or simply including them as raster images.

Unsurprisingly, choosing to embed these contents is more desirable since it preserves the original content and therefore gives a better result when it comes to high-end printing. This is made possible by including the PDF contents as Form XObjects into the resulting PDF. One main focus of this project is an attempt to improve this feature, especially with respect to color spaces.

Font Subsetting in PDF Export
In the current version of Scribus, when exporting to PDF, users have the choice of embedding fonts used in the document into the output PDF. This is again very desirable since it preserves the document's appearance better and in fact, PDF/X flavors of PDF require all fonts to be embedded.

The way Scribus handles this at the moment is either by embedding the whole font file or by converting it to outlines: the used glyphs appear to be drawn manually (using path construction operators) in the font's dictionary inside the document's Resources where the font is defined. A more common and natural way of doing this is to embed a subset of the font, so that only used glyphs are included.

PDF/X-1a and PDF/X-4 Exporting
This part of the project will implement the features of exporting to PDF/X-1a and PDF/X-4, maybe even PDF/X-5. Upon completion, Scribus should be able to export documents that conform to these standards. This is also meant to extend the functionalities of the Preflight Verifier, so that it will be able to verify documents against these standards.

As I mentioned earlier, supporting PDF/X-1a would be quite straightforward, since Scribus already fully supports PDF/X-3. Thus, by restricting the output to only use CMYK, I could create an output filter to produce valid PDF/X-1a files.

Supporting PDF/X-4 is a bit more complicated, though by using the PDF/X-3 rules as the basis I should be able to allow live transparency and layers to be included in the documents.
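
As a rough illustration of how the three flavors differ from a verifier's point of view, the sketch below encodes only the simplified rules stated in this proposal (embedded fonts, no live content, an output intent, CMYK-only for PDF/X-1a, transparency only in PDF/X-4). It is not derived from the ISO texts, whose rules are more detailed, and it is unrelated to the actual Preflight Verifier code; `DocSummary` and `preflight` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class DocSummary:
    """Facts about a document that a checker would gather (illustrative)."""
    fonts_embedded: bool
    has_output_intent: bool
    has_live_content: bool = False   # forms, annotations, sound, video, ...
    uses_transparency: bool = False
    color_spaces: set = field(default_factory=set)


def preflight(doc, standard):
    """Return a list of problems; an empty list means no rule was violated."""
    if standard not in ("PDF/X-1a", "PDF/X-3", "PDF/X-4"):
        raise ValueError("unsupported standard: " + standard)
    problems = []
    if not doc.fonts_embedded:
        problems.append("fonts are not all embedded")
    if doc.has_live_content:
        problems.append("live content is present")
    if not doc.has_output_intent:
        problems.append("output intent is missing")
    if standard != "PDF/X-4" and doc.uses_transparency:
        problems.append("transparency is not allowed")
    if standard == "PDF/X-1a":
        # Simplification taken from the text above: CMYK only.
        extra = sorted(doc.color_spaces - {"CMYK"})
        if extra:
            problems.append("non-CMYK color spaces: " + ", ".join(extra))
    return problems
```

Seen this way, PDF/X-1a adds one restriction to the PDF/X-3 rules, while PDF/X-4 lifts one.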

PDF Embedding
The PDF embedding feature of Scribus is basically working at the moment. However, the PDF contents are embedded "as-is": the PDF content is included as a form XObject in which it will be painted exactly as if it was a stand-alone PDF.

The problem however arises in case we need to explicitly manage the color profile of the final output (e.g. force all colors to CMYK in case of PDF/X-1a) - the colorspaces, which are used in the embedded PDF contents, need to be consistent with the whole document. At the moment, this is not the case in Scribus and therefore, this part of the project will try to solve that. This will involve converting from one colorspace to another and it is anticipated that the level of complexity of this project will be high.
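
To give a feel for what forcing colors to CMYK involves, the sketch below rewrites the PDF operators `rg` and `RG` (non-stroking and stroking RGB color) into `k` and `K` (their CMYK counterparts) using the naive device formula. This is a deliberately crude assumption: a real solution would use ICC profiles and a proper content stream parser, and would also have to deal with images, patterns and named color spaces. The function names are hypothetical, and the input is treated as plain whitespace-separated tokens.

```python
def rgb_to_cmyk(r, g, b):
    """Naive device conversion of RGB in [0, 1] to CMYK (no ICC profile)."""
    k = 1.0 - max(r, g, b)
    if k >= 1.0:
        return (0.0, 0.0, 0.0, 1.0)  # pure black
    scale = 1.0 - k
    return ((1.0 - r - k) / scale,
            (1.0 - g - k) / scale,
            (1.0 - b - k) / scale,
            k)


def rewrite_rgb_operators(content):
    """Replace 'r g b rg' and 'r g b RG' by the CMYK operators 'k' and 'K'.

    Only handles whitespace-separated tokens; strings, comments and inline
    images in a real content stream would need a real tokenizer.
    """
    out = []
    for token in content.split():
        if token in ("rg", "RG") and len(out) >= 3:
            try:
                r, g, b = [float(t) for t in out[-3:]]
            except ValueError:
                out.append(token)  # operands are not numbers; leave as-is
                continue
            del out[-3:]
            out.extend(format(v, ".3f") for v in rgb_to_cmyk(r, g, b))
            out.append("k" if token == "rg" else "K")
        else:
            out.append(token)
    return " ".join(out)
```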

Font Subsetting in PDF Export
This idea will be saved as one possible extension of the project if time permits.

As discussed previously, this part of the project will aim to implement the font subsetting feature for PDF export to only embed a subset of the font containing the glyphs that are actually used in the document. This would improve the efficiency of the PDF output as the resulting document would not have to contain the entire font file.
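
The first step of subsetting is finding out what each font actually has to contain. The sketch below is an illustration under simplifying assumptions: `used_characters` is a hypothetical name, the document is modeled as (font name, text) pairs, and characters stand in for glyphs, whereas a real implementation would map characters to glyph IDs through the font and then write a reduced font program.

```python
from collections import defaultdict


def used_characters(runs):
    """Map each font name to the sorted list of characters set in it.

    `runs` is an iterable of (font_name, text) pairs.
    """
    used = defaultdict(set)
    for font_name, text in runs:
        used[font_name].update(text)
    return {font_name: sorted(chars) for font_name, chars in used.items()}
```

For a short document the resulting sets are typically a small fraction of a full font, which is where the size saving comes from.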

Time Line

 * Now – May 23rd: Preliminary study: PDF, PDF/X, PoDoFo (PDF library used by Scribus), Scribus PDF Exporting source code, Qt.
 * May 23rd – June 15th: Implementing the PDF/X-1a & PDF/X-4 (X-5) export.
 * June 15th – July 15th: Improving the PDF embedding feature.
 * July 15th – Aug. 03rd: If the PDF embedding feature is done, move on to implement the font subsetting feature; if not, carry on with the PDF embedding.
 * Aug. 03rd – Aug. 17th: Finishing off the project, documentation and testing.

Participant Information

 * Name / University / current enrollment information
 * Name: Thach Tran
 * University: University of Nottingham, UK
 * Enrollment status: 3rd (last) year, Bachelor of Science (Honours) Computer Science
 * Biographical sketch
 * I'm an undergraduate student at University of Nottingham, UK. My major is Computer Science. I have great interest in digital documents, enterprise computing and programming languages in general.
 * I'm finishing the dissertation for my degree at the moment, where I developed a tool to convert documents from PDFXML (a.k.a. Mars) to PDF. See the project's page at Google Code and my interim report.
 * Last year, I also participated in a group project as part of my study. The project aimed to develop audio DSP software (called PASTA :-) i.e. Palette Augmented Sound Transformation Application) which can apply different effects to sound signals. Our team finished quite impressively and got a very high grade. See and  for more details.
 * For your information, feel free to take a look at my resume.
 * Did you ever code in C, C++ or Python? Please provide examples of code
 * I have coded in C and C++ for over a year now. However, I have never got a chance to study/code in Python. I wish I could start learning it soon.
 * C and C++ code examples. These are coursework that I did as part of my course.
 * C: Text-based Mastermind game
 * C++: Pacman game
 * Do you use Scribus? Please provide examples if you do.
 * I have never used Scribus before.
 * Do you make other use of Scribus than for laying out articles? Please describe and show
 * As I said earlier, I haven't used Scribus before. Now that I'm aware of it, I certainly hope to have a chance to use Scribus in the near future.
 * Have you been involved in Scribus development in the past? What were your contributions?
 * Unfortunately, no, I haven't.
 * Have you been involved in other Open Source development projects in the past? If yes, please tell us project, when and in what role were you involved.
 * No, I haven't.
 * I have actually used a lot of software/tools from the Open Source community, but the closest I ever got to interact with the community is subscribing to some mailing lists to participate in discussions. I have been involved in coding for up to 4 years now (since I started going to college) and I would think now is the time I could give back something to the community.
 * Why have you chosen your development idea and what do you expect from your implementation?
 * Since I started using computers to prepare my documents, I have always been impressed by the fact that PDF helps to preserve the appearance of my documents efficiently. Whether I have to bring my documents to a print shop to get them printed out or I have to send them electronically to different people, I always want to be assured that my work has a consistent look.
 * Along that line, I have developed my interest in PDF and in digital documents in general. I had my chance to actually work with PDF via the project I'm doing now at university, and since I enjoyed it so much, I wish I could follow the trend here in GSoC 2009.
 * Digital publishing in general is a challenging field that requires great attention to tiny details in order to produce beautiful and professional digital outputs. Scribus, as well-known software in this field, is a magnificent tool which helps users to design compelling page layouts and sensational publishing documents with ease. Since the ultimate purpose of using the software is to "publish" the work, exporting the result to PDF plays a key role in the software.
 * As everyone who has been to the low-level details of PDF can testify, PDF as well as colorspace management, font embedding and such are very challenging and might even seem tedious to a lot of people. This is exactly where I find my inspiration; I enjoy working with details and low-level programming.
 * While the PDF exporting feature of Scribus is already in good shape, there are still a lot of things that could be improved to make PDFs produced by Scribus more robust and more "press-ready". This is the main motivation behind the project.
 * I would expect the project, upon completion, to add some useful features to Scribus and consequently, satisfy the demand from users. As for myself, I think the project will be a chance to learn more about software development from fellow developers and moreover, to learn more about the great community of Open Source.
 * Are you ready and willing to sustain a good level of communication with your mentor and the Scribus Team overall and be open and forthcoming about the progress of your project including coding and personal problems related to your GSoC project?
 * Of course, I'm really looking forward to working with the Scribus Team. I had an extremely good experience preparing for this proposal; I have been in contact with the community along the way and I really enjoyed exchanging my ideas with people in the team. We did encounter a little conflict at first but it was easily settled and I'm sure things will be the same over the course of my project (if I get selected, anyway :-) ).
 * Contact details
 * Email: tranngocthachs@gmail.com
 * Phone: +447942606550
 * IMs: tranngocthachs on Skype and thachtran on freenode
 * Time zone: British Summer Time (GMT + 1)
 * I will be staying in the UK up to 23rd of July to attend my graduation ceremony. After that, I might move back to my home country as my study has finished. In that case, my phone number will be +84905211803 and I'll be in the time zone of Vietnam GMT + 7. I will keep you guys posted on any changes, but I'm sure this will not affect the project whatsoever.

=Implementing XPS support=

SYNOPSIS

My project aims at improving the import and export features in Scribus by implementing XML Paper Specification (XPS) support. After this project is completed, Scribus will be capable of handling Microsoft's latest XPS document format files in both import and export.

OVERVIEW

XPS Documents maintain a consistent appearance for documents, despite environmental variables, through the use of a fixed page layout and new technologies such as the Open Packaging Conventions, the XPS print path and XPS Viewer. The XPS document format consists of structured XML markup that defines the layout of a document and the visual appearance of each page, along with rendering rules for distributing, archiving, rendering, processing and printing the documents, and it allows vector-graphic elements to be incorporated in documents. The ability to import XPS files allows users to easily extract data from any file generated by future Windows applications. The XML Paper Specification (XPS) describes electronic paper in a way that can be read by hardware, software and even by people.
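
Because the Open Packaging Conventions store a document as a ZIP archive of XML parts, a first look inside an XPS file needs nothing more than a ZIP reader. The sketch below is only an illustration: `list_fixed_pages` and `build_demo_package` are hypothetical helpers, the part names are examples, and the demo package contains placeholder XML rather than a valid XPS document.

```python
import io
import zipfile


def list_fixed_pages(package_bytes):
    """Return the names of page parts (ending in .fpage) in archive order."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [name for name in package.namelist()
                if name.lower().endswith(".fpage")]


def build_demo_package():
    """Build a tiny in-memory archive shaped like a package (not valid XPS)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("Documents/1/Pages/1.fpage", "<FixedPage/>")
        package.writestr("Documents/1/Pages/2.fpage", "<FixedPage/>")
    return buffer.getvalue()
```

An importer would go on to parse each page part's XML markup; an exporter would do the reverse and write the parts into a new archive.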

DELIVERABLES


 * An XPS export feature that writes files with the extension .xps.


 * An XPS import feature that reads files with the extension .xps by using ghostxps.


 * Documentation

PROJECT SCHEDULE

April 21, 2009 Start of the project. Discussions with mentor to make the ideas clear. Detailed study of source code. Collecting the necessary components for the completion of the project.

July 03, 2009 Complete the XPS export, then revise this feature and fix bugs after discussions with the mentor and the development team.

August 10, 2009 Complete the XPS import, and with it the entire project, and submit all deliverables for the final evaluations.

ABOUT ME

First Name : Vipin

Name     : Vipin Vichattu Johney

Email Id : vipin.johney@gmail.com

IRC nick : vipx

Location : Kerala, India

Time zone : GMT + 5:30

Education : Computer Science Engineering

I am Vipin Vichattu Johney, currently a second year (4th semester) B.Tech student at Govt. Engineering College, Trichur, Kerala, India. I have been coding in C++ for the last 5 years and in C and Basic for the last 2 years.

I have participated in many programming contests. I have done many projects in C++ and C. I know the basics of the SQL and Python programming languages. I am an active member of Free Software Users Group, Thrissur http://fsugtsr.org. I have started reading the source code of Scribus.