HPC user support is a collective term for those activities of the HPC Competence Centre which provide practical technical assistance to the users or would-be users of our supercomputers. Thus, HPC user support does not cover general HPC education, training and courses, which constitute a separate branch of the activities of the Competence Centre.
User support is of vital importance because supercomputers are extremely complex technical systems, and using them efficiently requires a number of special skills. A significant number of users are not IT professionals, and therefore cannot be expected to have all the skills and experience needed to use them successfully.
This is where user support plays its role: it helps users in all phases – from the design of the application project, through the selection of suitable solutions and implementation, to successful runs. Users can thus focus on their professional, scientific tasks instead of getting bogged down in the details of using HPC as a tool.
In line with international experience, the KIFÜ HPC Competence Centre defines five levels of support, which users can draw on according to their needs. Towards the top of the pyramid, HPC CC offers higher-level services requiring increasingly specialised knowledge.
Levels 1 and 2 (basic levels of support) include assistance related to requesting HPC resources, to physical access, to the use of the basic level functions, as well as to case management and administration.
Level 3 (standard user support) provides answers to the most frequently asked questions arising during the use of the supercomputing resources, and solutions to potential technical problems.
Level 4 and Level 5 services (special services) usually involve application field-specific, complex support requiring highly advanced expertise. These services are provided by the specialists of the HPC Competence Centre (HLST – High Level Support Team), who may work in close cooperation with users for a prolonged time if necessary.
Level 1 – General (non-technical) issue management
These services are not closely related to the actual use of the supercomputers, that is, to the computation work. They cover all general, non-technical issues arising before or during actual use.
First and foremost, we contact prospective users and inform them about the available supercomputer resources and how to access them. We try to steer the application process in the most appropriate direction, which in practice depends on the computational method used, for example when the software requires a special architecture or can make effective use of accelerator technologies.
The HPC Portal provides the user interface for project applications, their evaluation, and other administrative functions. New applications are approved in a multi-step process, and we may occasionally ask for a revision depending on the quality of the application. Requests for machine time are also submitted here and are received by our ticket management system; the evaluation, and feedback to users about its outcome, also take place through this system.
Potential partners: all those wishing to use the HPC services of KIFÜ and having the appropriate authorisation.
Competences required for the services: general knowledge of KIFÜ and HPC CC case management, and basic level HPC and IT skills.
Level 2 – Technical support
A central element of our support activities is managing the technical issues that arise frequently during use.
Common sources of errors include login difficulties, problems with generating encrypted keys and configuring their permissions, and the use of an incorrect key pair or login node. Runtime problems may also arise during actual use, for a variety of reasons. Possible factors include improper run parameters, incorrectly installed software, an improperly selected module environment, incorrect read and write paths, and authorisation and access problems. Poorly estimated machine time, or machine time that was never requested, is also a source of errors. Issues entered into the OTRS ticket system are handled with high priority and short waiting times, provided the user cooperates.
Correctly installed scientific software is one of the fundamental conditions of using supercomputers effectively. The staff of the Competence Centre maintain an up-to-date set of the most commonly used open-source scientific software even without receiving any special requests, and support its use by providing a module environment, descriptions of the runs, and sample submission scripts.
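Key-related login failures of the kind listed above are often down to file permissions: OpenSSH refuses to use a private key file that is accessible to other users. The following is a minimal Python sketch, not a KIFÜ tool, that checks this on a POSIX system. The function name `key_permissions_ok` is an illustrative assumption.

```python
import os
import stat


def key_permissions_ok(path):
    """Return True if only the file's owner has any access to it.

    Hypothetical helper for illustration; assumes a POSIX file system.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any permission bit set for group or others counts as "too open".
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

For example, `key_permissions_ok(os.path.expanduser("~/.ssh/id_ed25519"))` checks a key stored under a typical default name; if it returns `False`, restricting the file with `chmod 600` usually resolves the problem.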
Potential partners: Any applicant having received a positive evaluation and wishing to commence actual work using KIFÜ’s HPC infrastructure.
Competences required for the services: general and KIFÜ-specific AAI skills, and user-level knowledge of the HPC infrastructure and services.
Level 3 – Standard user support
Daily support for active users is one of the services most in demand. This level is already an integral part of day-to-day work with supercomputers, and in effect extends the scope of Level 2 support. The primary focus of Level 3 is to help users who have obtained a project number and authorisation with their work, and to investigate and resolve potential problems.
When working with supercomputers, the types of support tasks depend on the field of science involved, and on the needs and skills of the user. These include correcting and amending the submission scripts used for the SLURM Resource Management and Job Scheduler System, and providing guidance on optimising resource use. Level 3 also includes support for installing and compiling new software or newer versions of existing software, investigating and resolving problems that arise when running the installed software, and responding to questions about resource and software use. In addition, this level is responsible for identifying inadequate resource use and informing the users concerned.
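As an illustration of the kind of submission-script check described above, here is a minimal Python sketch that reports which scheduler directives are missing from a SLURM batch script. The set of directives treated as required (`--time`, `--account`, `--partition`) is an assumption made for this example, not KIFÜ policy, and the function names are hypothetical.

```python
# Directives this sketch treats as required, mapped to their short forms.
# The selection is an assumption for illustration, not site policy.
REQUIRED = {"--time": "-t", "--account": "-A", "--partition": "-p"}


def sbatch_options(script_text):
    """Collect option names from the '#SBATCH' lines of a batch script."""
    found = set()
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#SBATCH"):
            for token in stripped[len("#SBATCH"):].split():
                if token.startswith("-"):
                    # Keep only the option name, e.g. '--time' from '--time=01:00:00'.
                    found.add(token.split("=", 1)[0])
        elif stripped and not stripped.startswith("#"):
            # sbatch ignores directives placed after the first command line.
            break
    return found


def missing_directives(script_text):
    """Return the required long options that appear in neither form."""
    found = sbatch_options(script_text)
    return sorted(
        long_opt
        for long_opt, short_opt in REQUIRED.items()
        if long_opt not in found and short_opt not in found
    )
```

Given a script that sets `-t 01:00:00` and `--partition=cpu` (a hypothetical partition name) but places `#SBATCH --account=...` after the first `srun` line, `missing_directives` reports `--account`, because directives after the first command are not processed. The sketch does not handle short options written without a space, such as `-t10:00`.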
The bug reporting site of the HPC Portal is of vital importance in this context; however, other means of communication such as e-mail, telephone, or OTRS bug tickets are also available.
Potential partners: Users actively working with the supercomputers and facing simpler or more complex technical problems in the course of their work.
Competences required for the services: High-level HPC operating skills, sometimes even interdisciplinary IT skills, and a detailed knowledge of the HPC infrastructure and software environments of KIFÜ.
Level 4 – Consulting services
As part of its consultancy services, KIFÜ assists with an extremely diverse range of HPC-related questions. This involves cooperating with potential users in designing their projects, in choosing one or more appropriate HPC solutions, in planning resource needs, and in obtaining access to national or international HPC resources. Users can also rely on us in their search for Hungarian or international partners, and in identifying and acquiring missing competences (including through technology transfer or specialised training). Upon request, we also help with R&D grant applications by developing their technical (HPC) content.
The fundamental objectives of these services are to provide comprehensive assistance for HPC applications in fields and for players not previously covered, and to encourage as many Hungarian players as possible to take part, as successfully as possible, in scientific and innovative activities related to HPC technology.
Potential partners: market players and academic users, representatives of highly diverse fields of application, and IT companies interested in the development of HPC technologies.
Competences required for the services: Highly advanced professional skills, knowledge of various fields of application, a detailed knowledge of the functioning of the HPC ecosystem, and international relations.
Level 5 – Application development and implementation services
These services are only available in cases of very special and carefully justified needs, and in practice involve steady, long-term (months- or years-long) close cooperation with the user concerned. Users may also ask for the development of customised HPC solutions (software, algorithms) suited to pre-defined objectives, or for a contribution to such developments.
These cover the development and implementation of completely new applications or algorithms, and the modification, improvement, optimisation, porting, and performance-tuning of existing algorithms.
In this field, KIFÜ relies heavily on external experts and on professional cooperation programmes with universities. It may undertake such tasks on a case-by-case basis and in limited numbers, depending on the available competences and free resources. This is extremely important for Hungary's standing among the countries pursuing significant HPC technology development, given that all major national HPC centres provide similar services.
Potential partners: Academic or market players having innovation potential and ambitions, that is, wishing to use supercomputers in an innovative way or for innovative tasks.
Competences required for the services: Highly advanced and specialised IT development and HPC skills, and HPC development experience.