Ray's Blog
Deploy Qwen2.5-VL on a Single Node

Overview

Qwen2.5-VL 72B is a flagship multimodal large language model, distinguished by its 72 billion parameters and advanced capabilities in vision and language integration. This model excels in a wide range of tasks, including sophisticated visual understanding, robust multilingual OCR, and complex document and video analysis. Unlike its predecessors, Qwen2.5-VL introduces dynamic resolution and temporal video alignment, allowing it to accurately process and summarize long-form videos and pinpoint events with second-level granularity. A key feature is its “agentic” ability, which enables it to act as a visual agent for interactive tasks, such as operating a computer or mobile device based on visual input and instructions. The model also offers precise object grounding with bounding boxes and can generate structured outputs in formats like JSON, making it highly suitable for applications requiring data extraction from tables, forms, and other complex layouts.

    Sunday, September 7, 2025 | 4 minutes Read
    Distributed Training on ALCF Polaris

    Overview

    Polaris is a high-performance computing (HPC) system at the Argonne Leadership Computing Facility (ALCF) that provides robust support for distributed training workflows and advanced scientific computing applications. This post walks through how to train a deep learning model in a distributed, parallel fashion using Hugging Face Accelerate, a library that simplifies distributed training across multiple GPUs and nodes.

    Prerequisites

    Before starting this tutorial, ensure you have:
    • ALCF Account: Active account with access to the Polaris system
    • Project Allocation: Computing time allocation on a project (you’ll need the project name)
    • MFA Setup: CRYPTOCard or MobilePASS+ token configured for authentication
    • Basic Knowledge: Familiarity with SSH, the Linux command line, and Python virtual environments
    • Python Experience: Understanding of deep learning concepts and PyTorch/Transformers

    What is DeepSpeed?

    DeepSpeed is Microsoft’s deep learning optimization library that enables efficient distributed training. It provides memory optimization techniques like ZeRO (Zero Redundancy Optimizer) and supports large model training across multiple GPUs and nodes with minimal code changes.
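To make the ZeRO memory savings more concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the per-parameter accounting used in the ZeRO paper for mixed-precision Adam training (2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 optimizer state). The function name and the example model size are illustrative assumptions, and activations and temporary buffers are ignored.

```python
def zero_model_state_bytes(num_params: float, num_gpus: int, stage: int) -> float:
    """Approximate per-GPU bytes needed for model states under ZeRO.

    Assumes mixed-precision Adam: 2 bytes (fp16 weights) + 2 bytes (fp16
    gradients) + 12 bytes (fp32 weights, momentum, variance) per parameter.
    Stage 0 means plain data parallelism with full replication.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    weights = 2.0 * num_params
    grads = 2.0 * num_params
    optim = 12.0 * num_params

    if stage >= 1:  # stage 1 partitions the optimizer states
        optim /= num_gpus
    if stage >= 2:  # stage 2 also partitions the gradients
        grads /= num_gpus
    if stage >= 3:  # stage 3 also partitions the weights
        weights /= num_gpus
    return weights + grads + optim


# Illustrative numbers only: a 7.5B-parameter model on 64 GPUs.
for stage in range(4):
    gigabytes = zero_model_state_bytes(7.5e9, 64, stage) / 1e9
    print(f"ZeRO stage {stage}: about {gigabytes:.1f} GB of model states per GPU")
```

The estimate covers model states only, so real jobs need extra headroom for activations, communication buffers, and fragmentation.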

      Saturday, September 6, 2025 | 4 minutes Read
      AskSage Python API Setup

      AskSage is a secure and extensible generative AI platform designed for government and commercial organizations, with a particular focus on the public sector and regulated industries. It provides a way for teams to leverage various large language models (LLMs) and other AI capabilities in a secure and compliant environment.

      Key features of AskSage include:
      • Multi-model access: Support for various LLMs including GPT, Claude, Gemini, and specialized government-approved models
      • Enterprise security: SOC 2 compliance, data encryption, and air-gapped deployment options
      • Audit trails: Complete logging and monitoring for regulatory compliance
      • Custom integrations: API access for embedding AI capabilities into existing workflows
      • Content filtering: Built-in safety measures and content moderation

      A comprehensive example is provided by AskSage here.

        Saturday, September 6, 2025 | 5 minutes Read
        Docker/Podman tips

        Docker unlimited resources

            docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -it --rm --privileged --detach-keys "ctrl-a,a" -v $(pwd):/workspace <image-name>

        Map detach-keys to something else

        The default detach shortcut is ctrl-p, ctrl-q, which conflicts with ctrl-p, the shell shortcut for recalling the previous command. Adding the option --detach-keys "ctrl-a,a" maps it to something else.

        Port forwarding

            -p 9999:9999

        Use Host DNS

            --dns-opt=/etc/resolv.conf

        Podman Container Device Interface (CDI)

        From NVIDIA ref

            sudo dnf clean expire-cache \
              && sudo dnf install -y nvidia-container-toolkit-base
            sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
            grep " name:" /etc/cdi/nvidia.yaml

        It should show the GPU device ids (this example is for two gpus):

        • Documentation
        Saturday, June 10, 2023 | 1 minute Read
        Manage Users in Linux

        List Users and Groups

            cat /etc/passwd

        The file stores all users in the system in a colon-separated format. Each line has 7 fields: <username>, a password placeholder (an x means the hashed password is stored in /etc/shadow), user id <uid>, group id <gid>, comment, home directory, and shell.

            cat /etc/group

        The file contains the group name, password, group id <gid>, and the user names in this group.

            id <username>

        Shows the user id, group id, and the groups the user is in.
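As an illustration of the seven-field layout described above, here is a minimal Python sketch that splits one /etc/passwd line into named fields. The PasswdEntry type, the parse_passwd_line helper, and the sample user are hypothetical names made up for this example. For real lookups, the standard library's pwd module is usually the better choice.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    """One line of /etc/passwd, split into its seven fields."""
    username: str
    password: str  # usually "x": the hash is kept in /etc/shadow
    uid: int
    gid: int
    comment: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Parse a single colon-separated passwd line (hypothetical helper)."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    username, password, uid, gid, comment, home, shell = fields
    return PasswdEntry(username, password, int(uid), int(gid), comment, home, shell)


# A made-up sample line, not taken from a real system.
sample = "alice:x:1001:1001:Alice Example:/home/alice:/bin/bash"
entry = parse_passwd_line(sample)
print(entry.username, entry.uid, entry.gid, entry.home, entry.shell)
```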

        • Documentation
        Tuesday, June 6, 2023 | 2 minutes Read
        An Intro to a CLI Password Management: Pass

        Pass

        Replace text in <> with your own info.

        pass

        Generate a gpg key pair

            gpg --full-generate-key

        Enter name and email address.

        To change expire date

            gpg --edit-key <email>@<address>

        Once you are in the interactive mode

            gpg> list
            gpg> expire
            gpg> save

        To change password

            gpg --passwd <email>@<address>

        To export public key

            gpg --export --armor --output public.pgp <email>@<address>

        To change cache time using gpg agent in secs

            cd ~/.gnupg
            echo "default-cache-ttl 86400" > gpg-agent.conf
            echo "max-cache-ttl 86400" >> gpg-agent.conf

        Pass

        Initialize Pass

            gpg -K # to show key id
            pass init <key_id>
            pass git init

        Add new password

            pass insert <name> # create a new password
            pass generate <name> # generate a new password
            pass list # list passwords
            pass generate <name>/<sub> # generate a nested password

        If we replace <name> with github, we will create a password file named github. Each password is stored as a file, like so:

        • Documentation
        Monday, May 29, 2023 | 2 minutes Read
        Setup Ubuntu 22.04

        install packages

            sudo apt update && sudo apt upgrade -y
            sudo apt install -y zsh git i3 feh scrot pavucontrol i3blocks tree curl htop picom
            sudo apt install cmake pkg-config libfreetype6-dev libfontconfig1-dev libxcb-xfixes0-dev libxkbcommon-dev clang

        install zsh and oh-my-zsh

            sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
            git clone https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting
            git clone https://github.com/zsh-users/zsh-autosuggestions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions

        install nvim

        https://github.com/neovim/neovim/releases/tag/v0.7.0

            git clone https://github.com/YHRen/nvim_config
            sudo apt install build-essential llvm xclip

        Follow the instructions in nvim_config.

        • Documentation
        Monday, May 2, 2022 | 2 minutes Read
        Exercism Cli Shortcut

        An Exercism CLI shortcut

        Exercism is a good place to learn and practice coding. It features a variety of programming languages (tracks), ranging from popular ones like Python and C++, to newer ones like Rust, Go, and Clojure, to exotic ones like Vimscript. Each track consists of a series of exercises that must be completed in sequence. Some tracks feature human mentors who provide feedback and approve or reject your solution. The next exercise is unlocked only after the mentor approves your solution, which may take a few iterations and some discussion. Most importantly, Exercism comes with a command line interface (CLI) that is very convenient to use.

          Thursday, July 9, 2020 | 3 minutes Read
          Pytorch Distributed Data Parallel With Model Parallel in an HPC Environment

          Distributed Data Parallel with Model Parallel in an HPC environment

          Objective

          This tutorial covers:
          • how to split a model and place it on multiple GPUs.
          • how to train such a model in a distributed data parallel fashion.
          • how to use torch.distributed.launch and create a Slurm job script for an HPC environment.

          Model Parallel (Pipelining)

          When a model is too large to fit on one GPU device, we can cut it in half and put each part on a different GPU device. To do this, we need to partition the model into “head” and “tail” and specify which device to put them on. In the following toy example, we simply put the first part on the current GPU device and the second part on the next device.
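The head/tail placement idea can be sketched without any GPU at all. The snippet below is a plain-Python planning helper, not the post's PyTorch code: plan_model_parallel and the layer names are hypothetical, and wrapping around to device 0 when the current device is the last one is an assumption made for this sketch. In actual PyTorch code, each half would be a module moved to its device with .to(device).

```python
def plan_model_parallel(layers, current_device, num_devices):
    """Split layers into a head and a tail and assign each half a device.

    The head goes on the current device and the tail on the next one.
    Wrapping around with a modulo is an assumption of this sketch.
    """
    if num_devices < 2:
        raise ValueError("model parallelism needs at least two devices")
    if len(layers) < 2:
        raise ValueError("need at least two layers to split")
    if not 0 <= current_device < num_devices:
        raise ValueError("current_device is out of range")

    split = len(layers) // 2
    next_device = (current_device + 1) % num_devices
    return {
        f"cuda:{current_device}": layers[:split],  # head
        f"cuda:{next_device}": layers[split:],     # tail
    }


# Hypothetical layer names for a small network.
plan = plan_model_parallel(["conv1", "conv2", "fc1", "fc2"], current_device=0, num_devices=2)
for device, part in plan.items():
    print(device, part)
```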

          • Documentation
          Thursday, December 12, 2019 | 5 minutes Read
          Python Conditional Timeit Decorator

          Introduction

          The conditional timeit decorator provides a convenient way to measure the time spent in individual functions. The behavior of the timer depends on the verbosity flag:
          • python main.py: the program runs quietly.
          • python main.py -v: the program reports its progress.
          • python main.py -vv: the program reports the timing of the individual functions it contains.

          Timeit Decorator

          A Python decorator changes the default behavior of the wrapped function.
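One possible shape for such a decorator is sketched below. This is a minimal sketch under stated assumptions, not necessarily the post's implementation: the settings dictionary, parse_verbosity, conditional_timeit, and slow_square are hypothetical names, and the verbosity level is assumed to come from counting -v flags with argparse.

```python
import argparse
import functools
import time

# Hypothetical shared settings; a real program would fill this in at startup.
settings = {"verbosity": 0}


def parse_verbosity(argv):
    """Count -v flags: no flag gives 0, -v gives 1, -vv gives 2."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser.parse_args(argv).verbose


def conditional_timeit(func):
    """Time func only when the verbosity level is 2 or higher."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if settings["verbosity"] < 2:
            return func(*args, **kwargs)  # quiet path: no timing overhead
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} s")
        return result
    return wrapper


@conditional_timeit
def slow_square(x):
    time.sleep(0.01)
    return x * x


settings["verbosity"] = parse_verbosity(["-vv"])
print(slow_square(3))
```

Checking the verbosity at call time rather than at decoration time means the flag can be parsed after the decorated functions are defined.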

            Wednesday, June 19, 2019 | 2 minutes Read
            Guides On Choosing Deep Learning Server

            Introduction

            Choosing the right GPU server for deep learning is often the first problem facing research teams in industry and academia. This article offers a few tips for picking the right hardware for your team. If the server is mainly for development, an RTX server is the most cost-effective choice. If it is for production, namely to go through TB to PB of data, it is better to use high-end scalable servers so that one can train a single model in parallel efficiently.

              Monday, June 10, 2019 | 3 minutes Read
              C++ Enum Pattern

              Introduction

              Very often we want to define a set of items to choose from, for example, a set of colors. In C++, we can usually declare enum class Color { Red, Green, Blue }. However, besides using it in a switch statement, we can hardly use it for anything else. For example, if we want to print "Red" for Color::Red, we have to write another function with a switch statement somewhere else. When we want to add a new color, we have to change every place where we use switch. This is clearly an anti-pattern. Occasionally, we also want to know how many colors there are in total, and maybe even iterate through them. None of this is supported by the C++ enum. In this blog, I want to introduce the Enum Pattern in C++, which supports switch and contains encapsulated member functions, similar to the Enum class in Java. The idea came from this stackoverflow post.

                Wednesday, January 3, 2018 | 3 minutes Read
                Contact me:
                • yren@bnl.gov
                • yhren
                • Yihui (Ray) Ren

                Liability Notice: This blog is for informational and educational purposes only. The content provided here represents personal opinions and is not intended as professional advice. Readers should not rely solely on this information and are responsible for their own actions and decisions. This blog is not liable for any damages or consequences resulting from the use of its content. The views expressed here are my own and do not reflect those of my employer or any funding agencies. © 2017-2025 Yihui Ren. All rights reserved.

