SCIndeks Premium package: Articles in HTML and interactive PDF
Centre for Evaluation in Education and Science, Belgrade
Abstract: The production, purpose and significance of HTML and interactive PDF (iPDF), as the dominant publishing formats in modern scientific journals, are described. The benefits of the HTML format are presented, as well as the functionalities of the interactive PDF as a by-product of the XML-based HTML production. In addition, some newly implemented technologies in SCIndeks are described in detail, together with the full functionality of the Premium package. Finally, a new SCIndeks interface, created to enable the integration of HTML versions of papers into the existing application, is reviewed.
For more than a quarter of a century, research papers have been offered almost exclusively in the Adobe PDF format (Portable Document Format). Being similar to printed papers and suitable for printing, as well as for easy sharing via the Internet, PDF quickly replaced the traditional hard copies of papers. However, with further development of the Internet, the growing need for quick access to information and the increasing use of mobile devices, the PDF format has proven to be too static and outdated. Thus, a new, more suitable format, HTML (Hypertext Markup Language), a markup language for documents to be displayed in web browsers, gradually imposed itself as the standard of modern academic publishing 1, 2.
HTML - significance and functionality
HTML is widely accepted thanks to the many benefits it provides:
- HTML is a universal format and the “language of the web”: a properly formatted HTML file is the easiest format for search engines such as Google Scholar, Bing, and Google to access and read.
- Interactivity is achieved through internal and external links, as well as audio and video content.
- It can include supplementary material.
- Papers in HTML format are most suitable for effective online use, while their conversion to PDF enables offline reading.
- The HTML format supports offline content storage, which allows users to access and read the paper without an internet connection if necessary.
- It works well on desktop computers, tablets, and mobile devices.
- It is shareable.
- The HTML format requires less storage space than the PDF format.
XML production and conversion to HTML format
The first step in HTML format production is to create an XML (Extensible Markup Language) file 3. XML is a simple, very flexible text format derived from SGML (ISO 8879). Its JATS (Journal Article Tag Suite) variant, designed for presenting and publishing scientific papers on the web, is mainly used. The JATS format is designed to be extensible, allowing the publisher to implement useful innovations as well as to tailor certain functionalities to specific users’ needs. It can also be converted to various formats, including PDF. JATS files are machine-readable and searchable, which means that search engines can access, read, and index them.
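The XML-to-HTML step described above can be illustrated with a minimal sketch. It assumes a tiny, hand-written JATS-like fragment and handles only the article title, section titles and paragraphs; the function name `jats_to_html` and the sample content are illustrative assumptions, not the CEON/CEES production workflow.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny, hand-written JATS-like fragment (illustrative, not real data).
JATS_SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group><article-title>Sample title</article-title></title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph.</p>
      <p>Second paragraph with &lt;markup&gt; characters.</p>
    </sec>
  </body>
</article>"""


def jats_to_html(jats_xml: str) -> str:
    """Convert a small subset of JATS (title, sections, paragraphs) to HTML."""
    root = ET.fromstring(jats_xml)
    parts = []
    title = root.findtext(".//article-title", default="")
    parts.append(f"<h1>{escape(title)}</h1>")
    for sec in root.iterfind("./body/sec"):
        heading = sec.findtext("title", default="")
        parts.append(f"<h2>{escape(heading)}</h2>")
        for p in sec.iterfind("p"):
            # itertext() flattens inline markup; a fuller converter would map
            # inline elements (italic, xref, ...) to their HTML counterparts.
            text = "".join(p.itertext())
            parts.append(f"<p>{escape(text)}</p>")
    return "\n".join(parts)


html_output = jats_to_html(JATS_SAMPLE)
print(html_output)
```

Because the structure lives in the XML, the same parsed tree could feed other output formats as well, which is the property the text relies on.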
In the HTML output produced by CEON/CEES, in-text citations are matched and linked to the references in the Bibliography, within a workflow that simultaneously ensures their validation. This procedure replaces the use of CiteMatcher, a tool developed in CEON/CEES for detecting duplicate or uncited references. In the HTML format, by clicking on the in-text citation, the user reaches the appropriate reference in the Bibliography (internal links). More details about the cited reference are obtained by accessing the cited paper in SCIndeks, PubMed, PubMed Central, Google Scholar and CrossRef using external links. The bibliography and in-text citations are displayed in accordance with the citation style used by the journal (APA or Vancouver). In addition to references, tables, figures and charts are also linked. They can all be enlarged, while figures can additionally be downloaded in the PPT format.
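The linking and validation of citations described above can be sketched as follows. This is a simplified illustration that assumes numeric, bracketed citations such as `[1]`; the function names, the `ref-N` anchor scheme and the sample data are hypothetical and are not taken from the CEON/CEES tools.

```python
import re
from html import escape


def link_citations(body_text, references):
    """Link [n]-style citations to bibliography anchors and validate both sides.

    Returns (html, dangling, uncited):
      dangling - citation keys that have no entry in the reference list
      uncited  - reference keys that are never cited in the text
    """
    cited = set()

    def replace(match):
        key = match.group(1)
        cited.add(key)
        if key not in references:
            return match.group(0)  # leave unresolved citations unlinked
        return f'<a href="#ref-{key}">[{key}]</a>'

    html = re.sub(r"\[(\d+)\]", replace, escape(body_text))
    dangling = sorted(cited - set(references), key=int)
    uncited = sorted(set(references) - cited, key=int)
    return html, dangling, uncited


def render_bibliography(references):
    """Render the reference list with anchors that the internal links target."""
    items = [f'<li id="ref-{key}">{escape(text)}</li>'
             for key, text in references.items()]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


# Illustrative data: reference 2 is never cited, citation [3] has no entry.
refs = {
    "1": "Brown, A. (2003). XML in serial publishing.",
    "2": "Wusteman, J. (2003). XML and e-journals.",
}
text = "XML is used in serial publishing [1], see also [3]."
linked_html, dangling, uncited = link_citations(text, refs)
bibliography_html = render_bibliography(refs)
print(linked_html)
print("dangling:", dangling, "uncited:", uncited)
```

Doing the linking and the check in one pass is what makes a separate detection step unnecessary: any citation without a target, or reference without a citation, surfaces while the links are being built.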
Interactive PDF (iPDF) as a by-product
XML is suitable for conversion to a variety of formats, including PDF. A PDF produced from XML, just like HTML, has internal and external links ensuring navigation within the article and between the article and related documents. It is essentially different from the conventional PDF, hence it is labeled differently, as the interactive PDF (iPDF). iPDF is much more user-friendly and therefore more popular among researchers. Given that it is actually a relatively cheap by-product of the so-called XMLization process, it is widely accepted by journal publishers.
SCIndeks Premium service package
After mastering the technology of HTML and iPDF production via XML, which is regarded as a significant development success of CEON/CEES, an additional package is now offered in SCIndeks: the Premium service package 3, 4, 5, 6. With it, SCIndeks has reached the technological level of the world's leading bibliographic databases and publishing platforms. This is expected to raise the reputation of SCIndeks journals and to increase their visibility and potential citability. During the XMLization process, we have also developed tools and procedures that keep the production cost of the two newly introduced formats at a minimum by global standards, making Premium services available even to modestly funded local journals. This makes the success of CEON/CEES even greater.
The HTML version of articles offered in the Premium package is equipped with all the functionalities that HTML as a format enables and is at the same time compliant with the specific requirements and publishing practice of local journals. The same circumstances have been taken into account in producing the iPDF format. Editorial boards can choose between different iPDF layouts (traditional or multicolor), modes (single-column or double-column), font types (Times New Roman or Candara), etc. 7. Additional changes to the design of the PDF format are possible, but they are charged extra, depending on their scope and complexity.
For the purpose of promoting the Premium package, CEON/CEES has prepared a number of pilot articles in HTML and iPDF formats. They are all from SCIndeks journals that have expressed interest in the Premium package. The list of journals with links to their pilot HTMLs and iPDFs is given in Appendix 1.
Innovated SCIndeks interface
The implementation of the new HTML and iPDF formats made an upgrade of the SCIndeks interface necessary, and the interface was improved at the same time. With its interactivity and plenty of additional information about articles, it is better adapted to researchers’ needs.
Metadata are displayed in the central frame of the article page, while the format of the full text can be selected by clicking on the appropriate icon-link above the title of the article (Figure 1.).
The full text in the HTML format (Figure 2.) opens on the same page, just below the displayed metadata. It contains expandable tables, charts, linked references, etc. In the left pane, an interactive table of contents of the article is always visible and accessible to users and enables easy and quick navigation through the article. Thanks to the mouseover (hover) function, references are immediately visible to the user within the article text, which further facilitates reading.
The article display in SCIndeks is enriched with additional metadata (Figure 3.), such as the peer review method, the dates of submission, revision, and acceptance, the date of pre-publication (Online First), the date of publication (of the printed version of the journal or the version on the publisher's website), and the date of publication in SCIndeks.
Also, the list of Related records (articles that share the largest number of cited references with the current article) is now displayed to offer the most relevant records in a prominent place, in the right frame, while the rest is available at the link below (Figure 4).
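The definition of Related records given above (articles sharing the largest number of cited references with the current article) can be illustrated with a small sketch. The data layout, the identifiers and the `top_n` cut-off are assumptions made for the example; the actual SCIndeks implementation is not described in this text.

```python
def related_records(current_id, cited_refs, top_n=3):
    """Rank other articles by the number of cited references they share
    with the current article; articles with no overlap are left out."""
    current = cited_refs[current_id]
    scores = []
    for other_id, refs in cited_refs.items():
        if other_id == current_id:
            continue
        shared = len(current & refs)  # size of the set intersection
        if shared > 0:
            scores.append((other_id, shared))
    # Most shared references first; ties are broken by identifier.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top_n]


# Illustrative corpus: article id -> set of normalized cited-reference ids.
corpus = {
    "A": {"r1", "r2", "r3", "r4"},
    "B": {"r1", "r2", "r3"},
    "C": {"r2", "r9"},
    "D": {"r7", "r8"},
}
top_related = related_records("A", corpus)
print(top_related)
```

The `top_n` parameter mirrors the split between the short list shown in the right frame and the remaining records available behind the link. The comparison only works if references are normalized to common identifiers first, which is why reference normalization appears among the package functionalities.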
A new option enabling sharing on social networks has been introduced. Users are also provided with the useful option "Show in both languages", which displays, on a click, the article metadata in both Serbian and English. This view can also be switched off with one click (Figure 5.).
Additionally, the Altmetric badge has been introduced, tracking online events related to the published article (mentions on Twitter, downloads within Mendeley, citations in articles on Wikipedia, etc.), as well as a Dimensions badge, which shows citation counts in that database (Figure 6).
In summary, along with the service items available in the Standard and Professional package that are subsumed, Premium includes the following functionalities (listed also in SCIndeks on the MyJournal page):
|Searching titles, keywords, abstracts, author names, cited author names, and references: SCIndeks engine|
|Article full text search and download: Repository of SCIndeks|
|Metadata conversion to a format suitable for download by other services: OAI-PMH|
|Normalization of project titles and identifiers listed in Acknowledgments: FundRef|
|Assigning persistent identifiers to authors: ORCID|
|Assigning DOIs and maintaining permanent access to the full text: CrossRef|
|Reference normalization based on normative databases (CrossRef, Medline, SCIndeks) and their inclusion into the citation counting corpus|
|Metadata online integration: functions "cited authors", "cited titles", "cited in", "related records", and "link to cited article"|
|Journal Bibliometric Report: Ranking journals according to ten indicators of impact and nine indicators of bibliometric quality; compliance with publishing and ethical standards; “categorization” of journals|
|Article metrics: citations in SCIndeks, CrossRef and Dimensions; number of visits and full-text downloads; mentions on social networks (Altmetric)|
|Journal online management and publishing system: SCIndeks Assistant|
|Article pre-publication - Online First: SCIndeks Assistant|
|Article quality support: MindTrap, RevRev, CrossCheck|
|Full support for journal licensing and indexing in DOAJ: P&LSS - defining journal policies, preparation of application; communication with DOAJ evaluators; after acceptance to the DOAJ, creation and regular submission of article metadata in the XML format to be visible in DOAJ and reindexed in the Dimensions database|
|Publishing research data and supplementary material related to articles: Figshare|
|Support for authors of submitted manuscripts to suggest reviewers|
|Application for indexing in WoS: Prospects Analysis and Application Preparation (PAAP) - for selected journals with the best prospects for acceptance, based on intensive monitoring and additional evaluation; subsequent reapplications|
|Application for indexing in Scopus: Prospects Analysis and Application Preparation (PAAP) - for selected journals with the best prospects for acceptance, based on intensive monitoring and additional evaluation; pre-evaluation, communication with the Scopus evaluation team; final application; subsequent reapplications|
|Consulting on strategic development of the journal: recommendations for the language of publication, the journal format, the review policy and legitimate techniques for increasing impact|
|Article publishing in the HTML format: internally and externally related citations/references; enlarged tables and graphs; for medical journals indexed in PubMed Central: preparation and regular submission of articles in the XML format to be published as HTML|
|Article publishing in iPDF format: CEON/CEES standard design, with an option of selecting the layout (traditional or multicolor), mode (single-column or double-column), font type, etc.; internally and externally related citations/references|
Notes:
- SCIndeks Assistant: online submission, review, production and publication of articles.
- MindTrap: warning the author of the manuscript and the editorial board about illegitimate references in the article (those that come from withdrawn or corrected papers and from fake or predatory journals).
- RevRev: evaluation of the quality of reviews by the editors and of their usefulness by the authors of the manuscript.
- CrossCheck: plagiarism prevention (CrossCheck/iThenticate).
- Support for suggesting reviewers: optional, for the editorial board.
Table 1: The Premium package functionalities
For Premium medical journals indexed in PubMed Central, CEON/CEES regularly submits XML of articles in a special format 8. An example (Journal of Medical Biochemistry) is given in Figure 7. The article full text produced by CEON/CEES is available on relevant pages, e.g. here.
- Brown, A. (2003). XML in serial publishing: Past, present and future. OCLC Systems & Services, 19, 149–154.
- Hoekman, A. (1999). Journal Publishing Technologies: XML. http://www.msu.edu/~hoekmana/WRA%20420/ISMTE%20article.pdf
- Wusteman, J. (2003). XML and e-journals. OCLC Systems & Services, 19, 125–127.
- Cui, B., Chen, X. (2010). An improved hidden Markov model for literature metadata extraction. In: Advanced Intelligent Computing Theories and Applications, 6th International Conference on Intelligent Computing, pp. 205–212.
- Giuffrida, G., Shek, E.C., Yang, J. (2000). Knowledge-based metadata extraction from postscript files. ACM DL, pp. 77–84.
- Huh, S. (2014). Journal Article Tag Suite 1.0: National Information Standards Organization standard of journal extensible markup language. Sci. Ed., 1, 99–104.
- O’Gorman, L. (1993). The document spectrum for page layout analysis. IEEE Trans. Pattern Anal. Mach. Intell., 15(11), 1162–1173.
- Zou, J., Le, D.X., Thoma, G.R. (2010). Locating and parsing bibliographic references in HTML medical articles. IJDAR, 13(2), 107–119.
Appendix 1. Pilot articles in HTML and iPDF formats
|2. Journal resuscitatio Balcanica||shorturl.at/iyFG8|
|3. Archive of Oncology||shorturl.at/uxF04|
|4. Scripta Medica||shorturl.at/alrQ7|
|5. Srpski arhiv za celokupno lekarstvo||shorturl.at/ezQUW|
|6. Vojnosanitetski pregled||shorturl.at/ceIVX|
|7. The University Thought - Publication in Natural Sciences||shorturl.at/nFHV8|
|8. Ratarstvo i povrtarstvo||shorturl.at/bxELW|
|9. Specijalna edukacija i rehabilitacija||shorturl.at/kqDSZ|
|11. Vojnotehnički glasnik||shorturl.at/clmBS|
|12. Glasnik Advokatske komore Vojvodine||shorturl.at/lnBQV|
|14. Psihološka istraživanja||shorturl.at/coCHL|
|15. Zbornik radova Filozofskog fakulteta Univerziteta u Prištini||shorturl.at/bjmQU|
|16. Journal of Agricultural Sciences (Belgrade)||shorturl.at/dxFOX|