Welcome to NATP 2026

12th International Conference on Natural Language Processing (NATP 2026)

February 27 ~ 28, 2026, Vancouver, Canada



Accepted Papers
CYBERSECURITY AWARENESS AMONG STUDENTS: A COMPARATIVE REVIEW OF RECENT STUDIES

Grace Llego 1 and Jim Alves-Foss 2, 1,2 Center for Secure and Dependable Systems, University of Idaho, Moscow, ID, USA

ABSTRACT

Since about 2010, there has been increased attention to gauging and improving digital citizenship across the world. One important aspect of digital citizenship is cybersecurity awareness. This paper reports on a review of 35 studies conducted in this area, primarily consisting of surveys of students. The primary purpose of this review was to compare and evaluate the methods used in these studies in order to provide guidance for future studies. The studies surveyed between 32 and more than 3,000 participants and evaluated cybersecurity awareness and/or practice; about half of them provide copies of the survey questions, which allows for follow-on comparative studies.

Keywords

Cyber security awareness, Surveys, Digital Literacy


DuckDB Performance - Loading and Processing Data from Different File Formats

João Vicente Markovicz Martins, Jonathan Santos da Silva, Rodrigo Ribeiro Barbosa da Silva, Giovanna Victória Souza Venier, and Paulo Jorge Matos, Polytechnic Institute of Bragança (IPB), School of Technology and Management, Bragança, PT

ABSTRACT

The landscape of data analytics is shifting towards high-performance local processing, necessitating efficient tools for handling substantial datasets. This paper evaluates the performance of DuckDB, an embedded Online Analytical Processing (OLAP) database, focusing specifically on its interaction with different file formats. We conduct a comparative analysis of data loading times, query execution speeds, and sorting efficiency across three widely used formats: Apache Parquet, CSV, and JSON. While row-oriented formats like CSV and JSON are ubiquitous for data interchange, our benchmarks demonstrate that they impose significant I/O overhead in analytical pipelines. Conversely, the results highlight DuckDB’s optimized handling of columnar formats, showing that Parquet offers superior performance in both ingestion and query latency. These findings reinforce the importance of format selection in local analytical workflows and validate DuckDB’s suitability for modern data engineering tasks.

Keywords

DuckDB, data processing, Parquet, CSV, JSON, performance benchmarking, file formats


AN INNOVATIVE ANALYSIS OF PARTIAL PENALTY ON IMBALANCED DATA IN CREDIT CARD FRAUD PREDICTION

Jiawei Zhang 2, Xin Zhang 3, Xinyin Mia 1, 1 Senior Investment Analyst, PRA Group (Nasdaq: PRAA), Norfolk, Virginia, USA, 2 Data Scientist, PRA Group (Nasdaq: PRAA), Norfolk, Virginia, USA, 3 Senior Data Analyst, American Airlines Group Inc (Nasdaq: AAL), Dallas, Texas, USA

ABSTRACT

This paper presents an innovative partial penalty methodology for machine learning models that handles the data imbalance encountered in credit card fraud detection. Unlike the usual over-sampling or under-sampling methodologies, partial penalty directs the machine learning model to focus on learning the minority class of the target variable even when the class distribution is extremely imbalanced. Besides comparing the partial penalty approach with over-sampling and under-sampling approaches to data imbalance, we implement the new approach with five machine learning classification models: Logistic Regression, Random Forest, kNN, Decision Tree, and Light Gradient Boosting Machine (LightGBM). The partial penalty approach achieves an F1 score of 88.35% and an AUC of 98.79% with LightGBM, higher than either the over-sampling or the under-sampling approach.

Keywords

Partial Penalty, Gradient Boosting, Data Imbalance, Credit Card Fraud Detection, SMOTE
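
The abstract does not spell out how the partial penalty is computed. As a rough illustration of the general idea of steering a model toward the minority class without resampling, the sketch below uses a class-weighted binary cross-entropy, which is a standard technique and not necessarily the authors' method. The function name `weighted_log_loss` and the parameter `minority_weight` are hypothetical.

```python
import math

def weighted_log_loss(y_true, p_pred, minority_weight=10.0):
    """Binary cross-entropy with a heavier penalty on the positive class.

    Assumption: label 1 marks the rare (fraud) class. This is ordinary
    class weighting, shown only as an analogy to the partial penalty idea.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            # Errors on the minority class are scaled up.
            total += -minority_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)

# One fraud case and one normal case, scored by two hypothetical models.
misses_fraud = weighted_log_loss([1, 0], [0.1, 0.1])
catches_fraud = weighted_log_loss([1, 0], [0.9, 0.9])
print(misses_fraud > catches_fraud)
```

With the weight above 1, a model that misses the fraud case is penalized more than one that raises a false alarm on the normal case, so no rows need to be duplicated or discarded.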


A MIXED-PRECISION RISC-V PIPELINE WITH HARDWARE-MANAGED PRECISION SELECTION FOR MATRIX MULTIPLICATION

Ashwin Geeni 1, Independent Researcher, USA

ABSTRACT

AI and deep learning workloads rely heavily on matrix multiplication, which demands both high speed and numerical accuracy. Traditional processors use fixed arithmetic precision, forcing a global trade-off between performance and accuracy. This paper presents the Mixed Precision Pipeline (MPP), a five-stage RISC-V processor pipeline that dynamically switches between INT8, FP16, and FP32 precisions based on runtime numerical stability analysis.

Keywords

Mixed-precision computing, RISC-V, Matrix multiplication, Hardware acceleration, Numerical stability


Multirate CNN Architectures for Deep Learning

Guowei Xiao 1, Ping Wang 2, 1 Faculty of Automation Dept, Guangdong University of Technology, Guangzhou, 510006, China, 2 Jinqi, Micro Electronics, China

ABSTRACT

This paper proposes multirate Convolutional Neural Network (mCNN) algorithms for an efficient implementation of 2-Dimensional (2-D) CNN circuits. With the rapid growth in computation power, Deep Learning (DL) using CNNs has widened the range of Artificial Intelligence (AI) applications. For convolution layers with a pooling operation, earlier work (Franca et al., 1985) first applied multirate algorithms [1-3] to the traditional (non-multirate) convolution kernel operation, using polyphase architectures to obtain a more efficient implementation of multirate filtering. In this work we extend that approach to 2-D CNNs by using time-varying coefficients, achieving an efficient implementation in which the memory (i.e. the line buffer) size is reduced M-fold, where M is the pooling factor, and the MACs run at 1/M of the clock rate. A design example of the first stage of a CNN system is provided, and its results are verified with the Matlab CNN-based digit recognition tool.

Keywords

CNN, ML, DL, AI, IC, Multirate, 2-D, Signal Processing, DSP, ASIC, Filter
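
The polyphase idea mentioned in the abstract can be illustrated in one dimension. The sketch below is a simplified assumption, not the paper's 2-D design with time-varying coefficients: it treats the downsampling as plain decimation by M and shows that filtering at full rate and then discarding samples gives the same result as computing only the kept outputs from M sub-filters. The function names are hypothetical.

```python
def fir_then_downsample(x, h, M):
    """Reference: run the FIR filter at full rate, then keep every M-th output."""
    full = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        full.append(acc)
    return full[::M]

def polyphase_decimate(x, h, M):
    """Polyphase form: compute only the outputs that survive downsampling.

    The filter h is split into M phases, h[p::M]. Each output uses every
    coefficient once, so the multiply-accumulate work per input sample
    drops by a factor of M compared with the full-rate reference.
    """
    out = []
    for n in range(0, len(x), M):           # only every M-th time index
        acc = 0
        for p in range(M):                   # phase index
            for j, coeff in enumerate(h[p::M]):
                idx = n - (j * M + p)        # input sample paired with h[j*M + p]
                if idx >= 0:
                    acc += coeff * x[idx]
        out.append(acc)
    return out

x = list(range(1, 13))   # toy input signal
h = [1, 2, 3, 4, 5]      # toy filter taps
print(polyphase_decimate(x, h, 2) == fir_then_downsample(x, h, 2))
```

Both functions return the same sequence; the difference is that the polyphase version never computes the samples that would be thrown away.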


An External Emergence-Stabilization Layer for Cloud and IoT Systems

Noriyuki Suzuki, Nayuta Spiral Works (Independent Research), Japan

ABSTRACT

Modern cloud and IoT systems increasingly exhibit unstable behaviors caused by non-synchronous data flows, latency fluctuations, and noise amplification across distributed components. Existing stabilization techniques mainly rely on modifying internal system logic or applying domain-specific heuristics, making them difficult to generalize across heterogeneous platforms. This paper introduces an External Emergence-Stabilization Layer (EESL), a model-agnostic, black-box layer designed to regulate emergent behaviors in distributed systems without requiring internal modifications. EESL operates by monitoring high-level system dynamics and guiding them toward phase-aligned, low-divergence trajectories. We show that many forms of instability in cloud, IoT, and AI systems share a common structure expressed as divergence in behavioral phase, and that stabilizing this phase difference leads to consistent system output. We evaluate EESL under a range of destabilization scenarios, including non-synchronous IoT sensing, latency-induced cloud drift, and perturbed generative behaviors. Across all settings, EESL reduces divergence, shortens recovery time, and improves operational consistency without domain-specific tuning. The results suggest that emergence-level stabilization is a viable and generalizable engineering approach for future distributed and intelligent systems.