Repository Community: null

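https://dspace.hansung.ac.kr/handle/2024.oak/1931
2026-05-14T16:24:11Z

https://dspace.hansung.ac.kr/handle/2024.oak/10671
Title: A Framework for Object Tracking and Segmentation in Multi-View Video (다시점 영상에서 객체 추적 및 세그멘테이션을 위한 프레임워크)
Author(s): 용지현
Abstract: Single-camera video is limited in object recognition and tracking performance in complex scenes because of its restricted field of view and object occlusion. Multi-view video can ease these limitations, but because the number of images to process per frame increases, real-time object tracking and segmentation must cope with high computational complexity and very large data volumes. This paper takes plenoptic images and multi-camera images, two representative forms of multi-view video, as its subjects and proposes efficient object tracking and segmentation frameworks suited to them. First, for plenoptic images, an existing 2D video-based object tracker is restructured to fit the plenoptic structure. A focal plane image selection strategy picks only the essential images from the many focal plane images in a frame, and the deep learning-based feature extraction module and the preprocessing stage inside the framework are parallelized across a multi-core CPU and a GPU to maximize computational efficiency. Second, for multi-camera images, the paper proposes an efficient segmentation framework that dynamically combines a lightweight Video Multi-Object Segmenter, built with a low-rank projection matrix, and a lightweight Mask Refiner with the original models. Cosine similarity between consecutive frames is used to adaptively adjust the degree of lightweighting applied to the images of the current frame, which enables finer-grained model application. In a multi-GPU environment, lightweight and original models run side by side, and the resulting imbalance in execution time across GPUs can cause per-frame delay. To address this, the framework takes the hardware interconnection of the GPUs in the system into account at every frame and performs inter-GPU data movement optimally. This keeps each GPU's per-frame segmentation time balanced and is designed to minimize the average per-frame execution time of the whole system. In experiments, the proposed plenoptic tracking framework reduced execution time by 81.7% relative to the existing approach, and the multi-camera segmentation framework cut per-frame execution time by 34.3% while keeping the IoU drop from lightweight models within 2.86%.
【Keywords】Plenoptic, thread pool, multi-stream, multi-view, low-rank approximation, adaptive GPU load redistribution

Abstract (author's English version): Single-camera video suffers from limited object recognition and tracking performance in complex scenes because of its restricted field of view and occlusion. Multi-view video can mitigate these limitations, but the larger number of images to process per frame leads to high computational complexity and massive data volumes in real-time object tracking and segmentation.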
This paper proposes efficient object tracking and segmentation frameworks for two representative types of multi-view video: plenoptic imaging and multi-camera imaging. First, in the plenoptic imaging setting, we restructure existing 2D video-based object trackers to fit the structure of plenoptic images. We also introduce an image selection strategy that keeps only the essential focal plane images from the many available in each frame, and we maximize computational efficiency by parallelizing the deep learning-based feature extraction module and the preprocessing stages across a multi-core CPU and a GPU.

Second, in the multi-camera imaging setting, we propose an efficient segmentation framework that dynamically combines a lightweight Video Multi-Object Segmenter, built with a low-rank projection matrix, and a lightweight Mask Refiner with the original models. Cosine similarity between consecutive frames is used to estimate how much the target objects move or change. This information enables adaptive adjustment of the degree of lightweighting for the current frame, allowing fine-grained model selection. In a multi-GPU setting, the coexistence of lightweight and original models can lead to execution time imbalance across GPUs, causing frame-level latency. To mitigate this, the proposed framework optimizes inter-GPU data transfers at each frame by taking the hardware connectivity of the GPUs into account. As a result, the segmentation execution time per frame is balanced across GPUs, minimizing the overall average per-frame execution time. Experimental results show that the proposed multi-camera segmentation framework keeps the IoU drop caused by lightweight models within 2.86% while reducing average per-frame execution time by 34.3%.
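The similarity-driven model selection described in this abstract can be illustrated with a minimal sketch. The abstract does not give the features, thresholds, or number of lightweight levels, so everything named here (`LEVELS`, `select_model_level`, the threshold values, and the use of plain feature vectors) is a hypothetical stand-in rather than the author's implementation. The idea shown is only that a high cosine similarity between consecutive frames suggests little change, so a lighter model may be acceptable.

```python
import math
from typing import Sequence

# Hypothetical thresholds and level names, chosen only for illustration.
# Checked in order: the first threshold the similarity meets decides the level.
LEVELS = [(0.98, "light"), (0.90, "medium")]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_model_level(prev_features: Sequence[float],
                       curr_features: Sequence[float]) -> str:
    """Pick a model variant from the similarity of consecutive frames.

    High similarity (little change) selects a lighter variant; low
    similarity falls back to the original, full-size model.
    """
    similarity = cosine_similarity(prev_features, curr_features)
    for threshold, level in LEVELS:
        if similarity >= threshold:
            return level
    return "original"


if __name__ == "__main__":
    # Identical frames: the lightest variant is selected.
    print(select_model_level([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    # Orthogonal feature vectors: fall back to the original model.
    print(select_model_level([1.0, 0.0], [0.0, 1.0]))
```

In a multi-camera setup, such a function would presumably be evaluated once per camera image per frame, which is what makes the resulting per-GPU workloads uneven and motivates the load redistribution the abstract goes on to describe.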
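【Keywords】Plenoptic, Thread pool, Multi-stream, Multi-view, low-rank approximation, Adaptive GPU load redistribution
2025-12-31T15:00:00Z

https://dspace.hansung.ac.kr/handle/2024.oak/10669
Title: A Study of Model Compression Techniques and Block-Level Scheduling for Efficient Execution of Multiple Deep Learning Models (다중 딥러닝 모델 실행 효율화를 위한 경량화 기법 및 블록 레벨 스케줄링 연구)
Author(s): 김혁수
Abstract: When multiple deep learning models run concurrently in an embedded environment, resource contention among the models causes execution delays, which can lead to critical performance degradation in latency-sensitive systems. Mitigating this problem requires model compression techniques, of which quantization and pruning are representative. However, most applications of these techniques focus on classification models, and studies applying the same techniques to detection and tracking models are comparatively rare. To extend this work, this paper approaches efficient DNN execution from two directions. First, in a server environment, pruning and quantization were applied alone or in combination to a variety of visual intelligence models, and the changes in model size, parameter count, and accuracy were analyzed systematically. The experiments showed that applying both techniques together gave a larger parameter reduction in some models while minimizing accuracy loss. In addition, a TensorRT-based runtime optimization that optimizes inference without changing the model structure was applied; through additional computation graph optimization and kernel fusion, it delivered inference speedups and memory efficiency gains that are hard to obtain with the compression techniques alone. Second, to minimize execution delay when multiple DNN tasks run concurrently in an embedded environment, the paper proposes block-level dynamic scheduling and block-level dynamic switching. The method partitions each model into blocks, which are functional units, and uses them as execution units; it identifies blocks that actually cause delay when run in parallel and switches them to sequential execution. It also uses LAG, a metric that quantifies each block's execution delay, to replace blocks expected to incur large delays with lightweight blocks at runtime, maintaining the balance between latency and accuracy in real time. In experiments running heterogeneous DNNs concurrently on the NVIDIA Jetson AGX Xavier board, a representative embedded platform, the proposed method achieved up to a 29.3% reduction in latency while retaining at least 90% of the baseline accuracy.
【Keywords】Embedded deep learning, LAG, EMA, model compression, block replacement, multi-DNN scheduling

Abstract (author's English version): In embedded environments, running multiple deep learning models concurrently can cause execution delays due to resource contention among the models, ultimately leading to severe performance degradation in latency-sensitive systems. Mitigating this issue requires model compression techniques, of which quantization and pruning are the most representative. However, these techniques have mostly been applied to classification models, and applying them to detection and tracking models is relatively uncommon.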
Therefore, this paper addresses efficient DNN execution from two complementary perspectives. First, in a server-based environment, pruning and quantization were applied individually and in combination to various vision models to analyze changes in model size, parameter count, and accuracy. Experimental results show that combining the two techniques yields greater parameter reduction in some models than using either technique alone, while keeping accuracy degradation minimal. Second, to minimize execution delays when multiple DNNs run concurrently, we propose block-level dynamic scheduling and block-level dynamic replacement techniques. Each model is divided into functional units called blocks, which serve as the fundamental execution units. The scheduler identifies blocks that cause additional latency when executed in parallel and selectively switches them to sequential execution. Moreover, using LAG, a metric that quantifies the execution delay of each block, blocks expected to incur significant latency are dynamically replaced at runtime with lightweight alternatives to maintain a real-time balance between latency and accuracy. Experiments on a representative embedded platform, the NVIDIA Jetson AGX Xavier, show that the proposed method achieves up to a 29.3% reduction in latency while preserving more than 90% of the baseline accuracy when executing heterogeneous DNNs simultaneously.
【Keywords】Embedded Deep Learning, LAG, EMA, Model Compression, Block Switching, Multi-DNN Scheduling
2025-12-31T15:00:00Z

https://dspace.hansung.ac.kr/handle/2024.oak/10634
Title: A Study of AI-Based Preprocessing and Data Augmentation Techniques for Improving the Reliability of One-Dimensional Signal Analysis (1차원 신호 분석 신뢰성 향상을 위한 인공지능 기반 전처리 및 데이터 증강 기법 연구)
Author(s): 이주형
Abstract: One-dimensional (1D) signals serve in many fields as a key source of information for determining material composition, reaction state, and the presence of anomalies. In real measurement environments, however, a low signal-to-noise ratio and restricted measurement conditions often leave signals heavily contaminated or make it hard to obtain a high-quality training dataset that covers the full range of measurement conditions.
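In particular, although morphological features such as peak position, height, and width are the central indicators for quantitative analysis, existing denoising and data augmentation techniques either fail to preserve peak structure adequately or fail to reflect waveform characteristics that change nonlinearly with measurement conditions and interactions between components, which limits the reliability and generalization of deep learning-based analysis models. To address these limitations, this paper proposes two methodologies that together improve signal quality and ease data scarcity for peak-based 1D signals. First, it proposes a peak-aware adaptive denoising technique that explicitly accounts for peak structure. After peaks are automatically detected in a noisy 1D signal, a regression model learns the optimal Savitzky–Golay (SG) filter window length for each segment from the inter-peak distance and the Euclidean distance (ED) to a reference signal, overcoming the limitation of the conventional approach of applying one fixed window to the entire signal. In experiments on Raman spectroscopy signals, the proposed technique reduced ED by about 28% and improved the Fréchet Inception Distance (FID) by about 48% relative to fixed-window SG filtering, improving both peak shape preservation and noise suppression. Second, it proposes a nonlinear mixture signal generation framework that produces high-quality 1D mixture signals reflecting nonlinear mixing interactions from only a very small number of measured mixture signals. At its core, the framework builds on a conditional variational autoencoder that takes a mixing-ratio condition vector as input; it uses a few mixture signals measured at different mixing ratios as anchors and performs dynamic interpolation around the target mixing ratio, so that the model learns nonlinear mixing trajectories that a simple linear combination of pure-component signals cannot capture. A peak loss that directly penalizes differences in peak position and intensity is also introduced, effectively suppressing local errors in narrow peak regions that a mean squared error (MSE) loss alone tends to miss. In experiments on a binary Raman mixture system, the proposed framework reduced MSE by about 69.2% and improved FID by about 84.6% relative to simple linear combination, and achieved a Pearson correlation coefficient of about 0.999, generating signals that are nearly indistinguishable from measured ones. Overall, the peak-aware adaptive denoising technique and the nonlinear mixture signal generation framework proposed in this paper integrate peak-centered domain knowledge directly into model design, providing a unified solution that eases both signal quality degradation and data scarcity.
2025-12-31T15:00:00Z

https://dspace.hansung.ac.kr/handle/2024.oak/10574
Title: A One-Dimensional Signal Data Processing System Using a Neural Network-Based Automated Approach (신경망 기반 자동화 접근을 통한 1차원 신호 데이터 처리 시스템)
Author(s): 김예란
Abstract: One-dimensional (1D) signal data is an essential data type across a wide range of scientific and industrial fields, including spectroscopy, chemical analysis, and biosignal monitoring. However, the nonlinear baseline drift and background noise that commonly arise when measuring 1D signals in various domains are major factors that make quantitative interpretation of the data difficult. Existing techniques depend on manual tuning or generalize poorly, so this study proposes a general-purpose automated processing system to overcome these limitations. To validate the proposed system empirically, its effectiveness is verified on Raman spectrum data, a representative 1D signal. This paper presents two complementary approaches.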
First, APON-BC (Automated Parameter Optimization Network for Baseline Correction), proposed for computational efficiency, is a hybrid system in which a neural network automatically predicts the optimal algorithm parameters, enabling fast processing in resource-constrained environments. In experiments, APON-BC reduced FID and ED by 84.71% and 64.65%, respectively, relative to existing techniques at a low computational cost of 0.002 GFLOPs, demonstrating excellent efficiency and stability. Second, BC-Former (Baseline Correction with Transformer), proposed for precise reconstruction, is a Transformer-based end-to-end deep learning model that effectively corrects complex nonlinear baselines through realistic synthetic data and a dynamic loss weighting strategy. BC-Former reached CS and PCC values of 0.998 and minimized FID and ED to 0.037 and 0.286, respectively, showing unmatched precision and adaptability to unseen data. In conclusion, this study not only resolves the trade-off between efficiency and precision but also provides a two-track solution optimized for a broad range of operating environments, from resource-constrained embedded devices to high-performance computing. By presenting a methodological progression from a hybrid approach to a fully end-to-end deep learning framework, it also sets a new standard for fully automated, reliable analysis of diverse 1D signal data.

Abstract (author's English version): One-dimensional (1D) signal data is a fundamental data type across numerous scientific and industrial disciplines, including spectroscopy, chemical analysis, and biosignal monitoring. However, regardless of the specific domain, non-stationary baseline drift and background noise inherent to the measurement process significantly impede accurate quantitative interpretation. Existing techniques rely on manual tuning or generalize poorly. To address these challenges, this study proposes a generalized automated processing system and validates it empirically on Raman spectral data, a representative form of 1D signal. This paper presents two complementary approaches. First, Automated Parameter Optimization Network for Baseline Correction (APON-BC), proposed for computational efficiency, is a hybrid system that automatically predicts optimal algorithmic parameters using a neural network, enabling rapid processing in resource-constrained environments.
Experimental results demonstrate that APON-BC achieves exceptional efficiency and stability, reducing FID and ED by 84.71% and 64.65%, respectively, compared to existing techniques, with a remarkably low computational cost of 0.002 GFLOPs. Second, Baseline Correction with Transformer (BC-Former), proposed for precise reconstruction, is a Transformer-based end-to-end deep learning model. It effectively corrects complex nonlinear baselines by leveraging realistic synthetic data and a Dynamic Loss Weighting strategy. BC-Former exhibits unrivaled precision and adaptability to unseen data, achieving CS and PCC scores of 0.998 and minimizing FID and ED to 0.037 and 0.286, respectively. Consequently, this study not only resolves the inherent trade-off between computational efficiency and reconstruction precision but also offers a comprehensive solution tailored to a wide spectrum of operational environments, ranging from resource-constrained embedded devices to high-performance computing scenarios. By demonstrating the methodological evolution from a hybrid parameter-prediction approach to a fully end-to-end deep learning framework, this research establishes a new standard for the robust automation and reliability of 1D signal data analysis. 2025-12-31T15:00:00Z
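
Three of the agreement metrics quoted in these abstracts can be computed between a processed signal and its reference as in the sketch below. ED and PCC are the Euclidean distance and Pearson correlation coefficient; CS is assumed here to mean cosine similarity, which the abstracts do not spell out. The function names are illustrative, and FID is left out because it depends on a feature extractor that the abstracts do not specify.

```python
import math
from typing import Sequence


def _check_lengths(x: Sequence[float], y: Sequence[float]) -> None:
    """Reject empty or mismatched signals before computing a metric."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("signals must be non-empty and of equal length")


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """ED: square root of the summed squared point-wise differences."""
    _check_lengths(x, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """CS (assumed to mean cosine similarity): dot product over norm product."""
    _check_lengths(x, y)
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero signal")
    return dot / (norm_x * norm_y)


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """PCC: covariance of the mean-centred signals over the product of their spreads."""
    _check_lengths(x, y)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("Pearson correlation is undefined for a constant signal")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Toy values, not data from the thesis.
    reference = [0.0, 1.0, 4.0, 1.0, 0.0]
    corrected = [0.1, 0.9, 3.8, 1.1, 0.0]
    print("ED :", euclidean_distance(corrected, reference))
    print("CS :", cosine_similarity(corrected, reference))
    print("PCC:", pearson_correlation(corrected, reference))
```

ED depends on the amplitude scale of the signals, whereas CS and PCC are scale-invariant, so an ED value such as 0.286 is only meaningful relative to the normalization used in the underlying experiments.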