
CONTENT PROVIDERS: LIBRARIES, ARCHIVES, MUSEUMS

UDC searchability in the ALEPH 500 integrated library system at the National Technical Information Centre and Library (BME OMIKK)

Prokné Palik Mária <mpalik@omikk.bme.hu>

BME OMIKK

Marton József <jmarton@omikk.bme.hu>

BME OMIKK

In our library we classify documents with the UDC (Universal Decimal Classification) system. Library patrons encounter the results of this work through the online search interface. Integrated library systems usually offer two search methods. Browsing displays the library's documents according to the hierarchical structure of UDC. Direct searching returns the bibliographic descriptions of the matching documents as a result set, which has to satisfy the double requirement of completeness and accuracy.

When searching for UDC codes, we have to consider their structure and the semantics of their parts.

Improvements to the online catalogue of BME OMIKK led to the integration of a new search facility into the system, which makes it possible to search for semantically independent parts of UDC codes. With this search method we can obtain a great deal of information from the database that was previously hidden. In our talk we show some interesting, convincing examples to illustrate how the improved system helps to find simple and complex codes as well as independent and non-independent common auxiliary subcodes. We also attempt to assess the completeness and accuracy of the search result sets.
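The idea of searching for semantically independent parts of a UDC code can be illustrated with a minimal sketch. It is not the ALEPH 500 implementation: the token patterns cover only a simplified subset of UDC notation, and the names `split_udc` and `build_part_index`, as well as the sample records, are hypothetical.

```python
import re
from collections import defaultdict

# Simplified subset of UDC notation (an assumption of this sketch; real UDC
# syntax has further cases, e.g. special auxiliaries).
TOKEN_RE = re.compile(r'''
    (?P<time>"[^"]*")            # common auxiliary of time
  | (?P<form>\(0[^)]*\))         # common auxiliary of form, e.g. (075.8)
  | (?P<ethnic>\(=[^)]*\))       # common auxiliary of ethnic grouping
  | (?P<place>\([1-9][^)]*\))    # common auxiliary of place, e.g. (439)
  | (?P<language>=[0-9.]+)       # common auxiliary of language
  | (?P<connector>[+/:])         # addition, extension, relation
  | (?P<main>[0-9][0-9.]*)       # main number
''', re.VERBOSE)


def split_udc(code):
    """Split a UDC string into (kind, text) parts, left to right."""
    parts = []
    pos = 0
    while pos < len(code):
        match = TOKEN_RE.match(code, pos)
        if match is None:
            raise ValueError(f"unrecognized UDC notation at position {pos}: {code!r}")
        parts.append((match.lastgroup, match.group()))
        pos = match.end()
    return parts


def build_part_index(records):
    """Map every non-connector part to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, code in records.items():
        for kind, text in split_udc(code):
            if kind != "connector":
                index[(kind, text)].add(record_id)
    return index


# Hypothetical catalogue records: record id -> UDC code.
records = {"b1": "621.39(439)", "b2": '94(439)"19"', "b3": "621.39:004"}
index = build_part_index(records)
print(sorted(index[("place", "(439)")]))  # records sharing the place auxiliary
```

With such an index, a place auxiliary like (439) retrieves every record that carries it, whatever main number it is attached to, which a plain prefix search on the whole code would miss.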

Moreover, we would like to emphasize the importance of content-based searching in library catalogues. Developing user interfaces for information retrieval languages and extending them with different search methods can increase the use of documents.

A comparison of the performance of scientific publication databases in Hungary

Adomáimé Zolnai Dóra <dzolnai@atomki.hu>

MTA Atommagkutató Intézete

Zolnai László Dr., CSc. <zolnai@atomki.hu>

MTA Atommagkutató Intézete

The main focus of this paper is to reveal the causes of functional anomalies by analysing five representative Hungarian scientific publication databases. In the first part, after a short historical outline, the authors review the relevant definitions and present the sample databases:

• Scientific Publication Database of the Hungarian Academy of Sciences (ATPA),

• Publication Database of the Public Body of the Hungarian Academy of Sciences (KPA),

• Teacher’s Publication Database of the Faculty of Arts of the Péter Pázmány Catholic University (PPKE BTK OPA),

• Publication Database of the University of Miskolc (MEPA),

• Publication-Citation Database of the Institute of Nuclear Research of the Hungarian Academy of Sciences (APHA)

Also, they analyze

• the user side of the database

• the means of scientometric evaluation provided by the database

• the functional system development and economic background of the database

Additionally, the authors classify the main issues to be addressed by existing or future databases with nationwide coverage:

• the additivities filling database

• the occurrence of “inactive” authors and authors of the same name

• the problem of multi-author publications and references

• the question of self-reference

• the intricacies concerning SCI and non-SCI publications

• the issue of achievement additivity and of separating individual and institutional achievement

• the possibility of generating subject headings

• the question of augmentation with a repository

• the economic background of the database

In conclusion the authors give their recommendations concerning the issues listed above.

It is shown that the main reasons for the observed dysfunctions in the performance of the investigated databases are:

• the unclarified relations between the demands of scientometric evaluation and the content of the databases,

• the underestimation of the cost of operating the databases.

The e-portfolio and its potential impact on libraries serving higher education institutions

Czeglédi László <laszlo.czegledi@ektf.hu>

Eszterházy Károly Főiskola Központi Könyvtár

The evaluation of student assignments in the course of a training program results in several documents. While submitted texts used to appear primarily on paper, in the past few decades more and more documents have been prepared in electronic form. Consequently, the increasing role of such non-paper information carriers enables college and university students to replace the traditional senior thesis that crowns a multi-year academic effort with the submission of a portfolio. This type of electronic documentation summarizes the main aspects of professional and personal development in addition to recording the respective stages of academic achievement throughout the training process. Whereas the introduction of the e-portfolio appears to have several advantages that help solve pressing problems related to the traditional senior thesis (including the enforcement and monitoring of authenticity, retrieval, and accessibility), this new technology presents heretofore unseen challenges for student and instructor alike. Furthermore, the electronic portfolio brings unprecedented tasks for librarians as well, since the library of any higher education institution must be capable of receiving and registering all documents produced during the instruction process in addition to providing access to these texts.

The challenges raised by the e-portfolio require the application of historically tested methods in addition to calling for new approaches. A significant dilemma concerns the very subject of the recording and registering process, since several types of portfolio can emerge, ranging from comprehensive records of all student achievement to a prearranged selection handed in for the final examination. Consequently, in addition to the elaboration of a uniform system of processing criteria, concerns such as handling, preservation and retrieval have to be addressed, along with the physical and legal aspects of the accessibility of these documents.

One of the graduation requirements for students enrolled in teacher training programs at the Eszterházy Károly College is the preparation of an e-portfolio. My presentation, in addition to summarizing the respective institutional experiences and presenting a brief overview of the related options, provides an assessment of the potential concerns and challenges awaiting libraries serving higher education institutions.

CIDOC CRM compared with the MARC format

Simon András <andras.simon3@uni-corvinus.hu>

BCEEKK

Besides the widely used MARC format, we must nowadays also attend to the growing importance of the data description formats used by museums. This paper compares these data description and data definition formats (CIDOC CRM, CDWA Lite and LIDO XML) with MARC. It also examines whether any of them is ready to serve as the basis for current museum automation software: they may not be usable yet, they may be ready for use right now, or they may become ready in a couple of years. An important question is whether they are appropriate as the end point of a data migration, or whether databases in these unproven data formats might prove impossible to migrate without data loss in the future. We also examine the preconception that the data structure of museums is far more complicated and complex than the one that we librarians, who manage bibliographic databases and electronic catalogues, are used to.

Exploring a poster collection with the help of metadata at the University and National Library, University of Debrecen

Takács Margit <mtakacs@lib.unideb.hu>

Debreceni Egyetem Informatikai Kar Könyvtárinforma

Among the special collections of the University and National Library, University of Debrecen (Debreceni Egyetem Egyetemi és Nemzeti Könyvtár - DEENK), the poster collection adds a special dash of colour. The collection, created in 1952, contains more than 1.5 million posters and other small printed items. Thanks to the additional acquisition of posters from the 1920s and 1930s, the collection outlines the graphic works of the whole century. The continually growing holdings are an indispensable source for historical research, including industrial, economic, family and art history, and for many other kinds of investigation. Accordingly, the processing of these documents has become very important.

As posters are a very special type of document, their bibliographic description differs from the traditional one. The poster is a means of visual communication composed of graphical and typographical elements. When such documents are catalogued, it is therefore not enough to record the title and the author; it is also necessary to describe the “scenery”, because of the special style of the document. A correct and complete description helps users make sense of the small printed items and identify them precisely.

Besides traditional catalogues, digital libraries and electronic archives greatly support research. In these, beyond the bibliographic data and depending on copyright, the digitised form of the poster can also be displayed. In the University of Debrecen Electronic Archive (Debreceni Egyetem Elektronikus Archívuma - DEA), about 120 posters are accessible in digital form alongside their bibliographic data.

It follows from the above that revealing the characteristics of posters in library catalogues and in electronic archives is both possible and necessary. Several kinds of metadata systems are available for describing these documents: on the one hand the traditional MARC format, on the other hand the widely used Dublin Core metadata scheme. In both cases the description has to cover the special elements that are typical only of posters and that allow precise identification. My aim is to define the elements required for the exploration of this special type of document, fitting them to the metadata used in the DEA, and to compare the resulting metadata system with the FRBR model (Functional Requirements for Bibliographic Records).
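How a poster description might move between the two metadata systems can be sketched as a small crosswalk. This is an illustration only: the mapping is a reduced subset of the usual MARC 21 to unqualified Dublin Core correspondences, the sample record is invented, and nothing here is claimed to be the element set actually used in the DEA.

```python
# Reduced MARC 21 (tag, subfield) -> unqualified Dublin Core crosswalk.
# Only a few well-known correspondences are listed; this subset is an
# assumption of the sketch, not the DEA's actual mapping.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "c"): "date",
    ("520", "a"): "description",  # a free-text summary could hold the "scenery"
    ("655", "a"): "type",
}


def marc_to_dc(fields):
    """Convert (tag, subfield code, value) triples into a Dublin Core dict."""
    record = {}
    for tag, code, value in fields:
        element = CROSSWALK.get((tag, code))
        if element is not None:  # fields without a mapping are skipped
            record.setdefault(element, []).append(value)
    return record


# Hypothetical poster record, invented for illustration.
POSTER_FIELDS = [
    ("100", "a", "Example Artist"),
    ("245", "a", "Example exhibition poster"),
    ("260", "c", "1935"),
    ("300", "a", "1 poster"),
    ("520", "a", "Red lettering over a stylized factory skyline."),
    ("655", "a", "Posters"),
]

for element, values in marc_to_dc(POSTER_FIELDS).items():
    print(element, values)
```

The physical description field has no entry in this reduced table and is dropped, which shows the kind of poster-specific element that a chosen metadata set has to account for explicitly.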

Information Retrieval 2

Rédey Gábor Dr. <redeyg@iif.hu>

Országos Atomenergia Hivatal

The necessity of electronic information storage is increasingly recognized today, and accordingly the mass of electronically accessible textual data is growing continuously. This huge mass of electronically stored information also stimulates data processing, which calls for, among other things, more efficient information retrieval methods.

Search in extensive texts is an ill-defined problem, i.e. the task is formulated more and more exactly in the course of its solution. The person searching specifies and circumscribes ever more accurately what he/she wants to learn from the text, and so devises ever more complex and appropriate search terms.

Current search tools are not suitable for carrying this successive approximation process to its limits, because the search term they use is only a loose set of words without any interdependence. The interdependences are recognized only by the user who conceived the term. Furthermore, the text passages found by means of a loose set of keywords are often irrelevant in practice, even if the search engine ranks the hits somehow. This severely degrades search efficiency. After a while the human searcher gets tired of constructing newer and newer search terms, exhausting his/her own creativity and always hoping for better results, while the search engine does not "understand" what it is expected to do, or does so only with moderate success.

The reason for the phenomenon described above is that a search term is in fact not at all a loose set of keywords; it only becomes one in present search technology. Search terms are compound terms whose syntactic-logic structures are similar to those of sentences or sentence elements. This syntactic-logic structure cannot be neglected during search, as the structure of the search term is to be compared directly to the structure of sentence elements in the text.

The paper addresses how the syntactic-logic structure of a sentence and of a search term can be represented, how the required structure can be generated, and, consequently, how compound terms can be searched on the logic representation of texts.
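The difference between a loose set of keywords and a structured search term can be shown with a toy comparison. The abstract does not specify the representation used in the paper, so the sketch below assumes a very simple one: each sentence or term is reduced by hand to (head, modifier) pairs. All names and sample data are hypothetical.

```python
def bag_of_words_match(query_words, sentence_words):
    """Keyword search: every query word occurs somewhere in the sentence."""
    return set(query_words) <= set(sentence_words)


def structure_match(query_relations, sentence_relations):
    """Structured search: every (head, modifier) pair of the query occurs."""
    return set(query_relations) <= set(sentence_relations)


# Search term "school library": a library (head) belonging to a school (modifier).
query_words = ["school", "library"]
query_relations = [("library", "school")]

# Sentence A: "the school library opened"
sentence_a_words = ["the", "school", "library", "opened"]
sentence_a_relations = [("library", "school"), ("opened", "library")]

# Sentence B: "the library school opened" (a school of librarianship)
sentence_b_words = ["the", "library", "school", "opened"]
sentence_b_relations = [("school", "library"), ("opened", "school")]

for name, words, relations in [
    ("A", sentence_a_words, sentence_a_relations),
    ("B", sentence_b_words, sentence_b_relations),
]:
    print(name,
          bag_of_words_match(query_words, words),
          structure_match(query_relations, relations))
```

Both sentences contain the two keywords, so keyword search cannot tell them apart; only the comparison of structures rejects sentence B. Generating such structures automatically from text is the hard part and is not attempted here.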

Databank of Hungarians in Slovakia

Konkoly László <konkoly@foruminst.sk>

Fórum Kisebbségkutató Intézet

Main fields of interest of the Institute:

Documentation of the cultures of the Hungarian and other minorities in Slovakia; research; library; databases; archives; digitalization; internet-based databases; organization of training courses, conferences and workshops; exhibitions; publications.

Structure of the Institute:

1. Research of Hungarians, based in Samorin

2. Center of European Ethnology, based in Komarno

3. Center of Interethnic Researches, based in Samorin

4. Library - Bibliotheca Hungarica, based in Samorin

5. Archives of Hungarians in Slovakia, based in Samorin

6. Center of Digitalization, based in Komarno and Samorin

Home page of the Institute

Short chronology of the Institute’s home page:

• the beginnings

• static era (HTML)

• the first databases (structured data storage, online data upload, searchable databases etc.), PHP, MySQL

• switch to a CMS,

• launch of the Hungarian Databank

Hungarian Databank in Slovakia

• The Hungarian Databank in Slovakia was opened to the public in December 2007.

• The former structure of the home page has changed: the databases presenting the results of the Institute’s research and programs have come to the fore.

• The databases can be updated through the Institute’s self-developed CMS.

• The content of the databank reflects the wide range of activities of the Forum Institute. The Institute’s mission is the collection, processing, provision and internet-based presentation of historical, cultural and social information on Hungarians in Slovakia.

• Besides text-based content (library catalogues, bibliographies etc.), the Institute places emphasis on the digitalization of archive documents, photos, old postcards, sound materials etc.

• The main goal of the Databank is to develop modern digital content and to keep existing data up to date. Crucial in this work is the observance of standards and recommendations in the field of digitalization:

- observing data storage considerations,

- preparing simple and transparent table structures,

- thematic alignment of data and databases,

- using full-text search,

- using simple and transparent interfaces,

- ensuring quick, direct access to data

• Thanks to its results and reputation, the Forum Institute has become the provider of the Hungarian Databank in Slovakia in both quantitative and qualitative terms.

Structure and Content of the Databank

• Lexicon of Hungarians in (Czecho)Slovakia from 1918 to the present,

• Chronology of Hungarians in (Czecho)Slovakia from 1918 to the present

3rd set:

• Photo archive of the Forum Institute,

• Post card collection of the Institute,

• Documents, archive materials connected to the history of Hungarians in Slovakia,

• Events, Oral History, interviews, broadcast materials,

• Law collection on Slovakian minorities

4th set:

• Database of Hungarian institutions in Slovakia, Hungarian publishing houses,

• Database of events organized by Hungarians in Slovakia

5th set:

• Database of Slovakian towns and villages

The plans

Development and expansion of contents:

• digitalization of archive materials and magazines

• expansion of modern digital contents (photos, written documents, sound materials, internet-based leaflets etc.)

• building an integrated RSS search covering online magazines and institutions that use RSS
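What an integrated RSS search could look like is sketched below, using only the Python standard library. It is an assumption-laden illustration, not the Databank's planned design: the feeds are inline sample strings with invented titles and example.org links, and the function names `parse_items` and `search_feeds` are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_items(rss_xml):
    """Extract title, link and description of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def search_feeds(feeds, keyword):
    """Return the items of all feeds whose title or description contains keyword."""
    needle = keyword.casefold()
    hits = []
    for rss_xml in feeds:
        for entry in parse_items(rss_xml):
            haystack = (entry["title"] + " " + entry["description"]).casefold()
            if needle in haystack:
                hits.append(entry)
    return hits


# Invented sample feeds standing in for online magazines and institutions.
FEEDS = [
    """<rss version="2.0"><channel><title>Magazine A</title>
    <item><title>New photo archive opened</title><link>http://example.org/a/1</link>
    <description>Old postcards digitized.</description></item>
    <item><title>Conference report</title><link>http://example.org/a/2</link>
    <description>Workshop summary.</description></item>
    </channel></rss>""",
    """<rss version="2.0"><channel><title>Institution B</title>
    <item><title>Library news</title><link>http://example.org/b/1</link>
    <description>The ARCHIVE catalogue is online.</description></item>
    </channel></rss>""",
]

for hit in search_feeds(FEEDS, "archive"):
    print(hit["title"], hit["link"])
```

A production version would fetch the feeds over HTTP on a schedule and store the items in the Databank's own tables so that they can be searched together with the other content.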

Revision of the Databank’s back-end software:

• Internet technologies and informatics are developing very quickly. Technologies used, for example, two years ago are out of date today. Thus the main task in 2010 is the transformation of the portal’s back-end system.

TudásTár, a Nuclear Knowledge Base

Rédey Gábor Dr. <redeyg@iif.hu>

Országos Atomenergia Hivatal

Our Nuclear Knowledge Base project (TudásTár) started two years ago. The nuclear information stakeholders (HAEA, Paks NPP, TSOs and university department representatives) got together and decided that a national nuclear knowledge base would be essential for sharing and preserving the information and knowledge gathered by designers, researchers and the authority. Practical work on the TudásTár project began early last year.

The project started with conceptual and system design. Several known knowledge management systems (such as AGI - Information Management Consultants, NDA (National Digital Datastore), MEK (Hungarian Electronic Library), KIKERES (Government Information Retrieval System), ExLibris-DigiTool, OAI (Open Archives Initiative) etc.) were studied, and software developers were consulted to determine whether an existing software package or self-developed software would better cover the requirements.

The consultations concluded that the latter approach was to be preferred, as our operational experience, the uniqueness of the Hungarian language and our limited financial resources considerably narrowed both the available supply and our options; project costs were, however, reduced by the IAEA sponsoring the hardware.

Project management, document storage and metadata processing were kept in-house, while development partners were invited to collaborate on the Lotus Domino based web interface (Unioffice) and on language technology applications (MorphoLogic).

The prototyping and testing phase is now in progress and nearing completion. Some existing databases, such as those of authoritative documents, standards, tutorials and R&D project documents, were selected for processing. A prototype web search interface was developed that supports both full-text and metadata search and displays hits in their contexts.

Owing to the characteristics of Hungarian, full-text search is based on stem search, which requires stemmed indices. At this stage language technology expertise was utilized. The possible and available range of metadata and their storage format (HUNMARC/MARCXML) were determined, and the selected databases were processed individually to produce the necessary metadata records.
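Why an agglutinative language calls for stemmed indices can be seen in a small sketch of a stem-based inverted index. The stemmer below is a toy that strips a handful of Hungarian suffixes; it is a stand-in assumed for illustration and has nothing to do with the MorphoLogic technology used in the project. The documents and function names are likewise hypothetical.

```python
import re
from collections import defaultdict

# A few Hungarian case and plural endings, longest first (toy list, an assumption).
SUFFIXES = sorted(
    ["okban", "ekben", "ban", "ben", "nak", "nek", "ok", "ek"],
    key=len,
    reverse=True,
)


def toy_stem(word):
    """Strip the longest listed suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stem_index(documents):
    """Inverted index: stem -> set of ids of documents containing that stem."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[toy_stem(word)].add(doc_id)
    return index


def search(index, query_word):
    """Look up a query word by its stem, so inflected forms also match."""
    return index.get(toy_stem(query_word.lower()), set())


# Invented sample documents.
documents = {
    1: "A reaktorok biztonsága",
    2: "Mérés a reaktorban",
    3: "Könyvtári katalógus",
}
index = build_stem_index(documents)
print(sorted(search(index, "reaktor")))
```

Because both the documents and the query are reduced to stems, a query for "reaktor" finds "reaktorok" and "reaktorban" alike; an index of surface forms would need one entry per inflected form. Real Hungarian morphology is far richer than this suffix list, which is why dedicated language technology was needed.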

Conclusions are now being drawn, the necessary corrections are to be made, and future co-operating data-supplier partners are to be involved in the project.

Seek and You will find - or maybe not?

Talyigás Judit <talyigasj@mti.hu>

MTI Zrt

Our Internet era is characterized by the dynamic development of the technological background of infrastructure and services, while the number of users is growing continuously. Thanks to the latest technological developments, the storage, accessibility and utilization of ever increasing amounts of data have become possible. We expect users to know what they are searching for and, although it sounds like a contradiction, to marvel at and enjoy the materials they find. To put it simply: we expect them to be able to use these materials.

This means that a successful search engine is becoming more and more important. The ever more boundless quantity and quality of content has value only if one can find what he/she is eager to know and really needs.

The news and photo archives of MTI are among the biggest public collections in Hungary. A significant part of them, covering the period between 1987 and 2005, is available to the public via the Internet (Open Archives).

This presentation describes the user-friendly interface of these archives, highlighting the search support system, which, besides being precise, pops up a list of related possibilities. The search system is supported by graphical interfaces that make it easier to search by date, geographical area or theme. Behind this search utility is a Knowledge Base System, which contains over 10 000 Hungarian, foreign and international organizations with up-to-date information on their leaders. It can identify artists and celebrities too. The system distinguishes between persons with the same name. It also contains the important events. All these features were implemented to support the search system.

The whole system has been operating since January 2007. The few million news items and the search system have been available to the public since December 2009.