

Comments on Why is this symlink() call returning successfully while apparently failing to create the sym-link?

Post

Why is this symlink() call returning successfully while apparently failing to create the sym-link?

+6
−0

Summary

I'm building an internal system (hardware simulator, using Rust) to help test some Python-based services that talk to hardware. (The services talk to hardware via TTYs.) To trick the Python services into "believing" they're talking to the hardware they expect, I create some PTYs, where the master side is used by my simulator and the slave side of the PTY is given over to the Python-based service.

Since the Python service looks for its PTY by using a specific name, I create a symbolic link that points back to the slave PTY mentioned above, but using the name the Python service expects (or so I thought).

The problem is that the symlink function reports success, but no link actually exists in the file system. I'm not yet sure what I'm missing.

My question is: Does anyone know why this is happening and how to fix it?

Details

I'm using the std::os::unix::fs::symlink function to create the sym-link. The function's documentation is clear about its usage, e.g. symlink("a.txt", "b.txt") would create a sym-link called b.txt that points back to a.txt. (Assuming a.txt already exists.)

When I use the function in the simulator, I observe the following:

  1. The function call returns successfully, and
  2. No actual sym-link can be found in the file system.

I wrote a simple test program (below) to verify my usage, and it works fine:

use std::os::unix::fs::symlink;

fn main() {
    symlink("/tmp/src.txt", "/tmp/dst.txt").expect("symlink failed");
}

The above shows that I'm using it correctly, with dst.txt being the link that points back to src.txt. Also, the above would've failed if I had the arguments inverted or didn't have the correct permissions, so it's not that, either.
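A slightly fuller version of the same test can also read the link back, which confirms both that the link exists and which path it stores. This is only a sketch: the helper name create_and_verify and the scratch directory are made up for illustration and are not part of the simulator.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Creates `link` pointing at `target`, then reads the link back to confirm it.
/// (Hypothetical helper, not taken from the simulator.)
fn create_and_verify(target: &Path, link: &Path) -> io::Result<PathBuf> {
    symlink(target, link)?;
    // symlink_metadata inspects the link itself instead of following it.
    let meta = fs::symlink_metadata(link)?;
    if !meta.file_type().is_symlink() {
        return Err(io::Error::new(io::ErrorKind::Other, "not a symlink"));
    }
    // read_link returns the path stored inside the link.
    fs::read_link(link)
}

fn main() -> io::Result<()> {
    // Scratch directory under the system temp dir; the name is an assumption.
    let dir = std::env::temp_dir().join(format!("symlink-check-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let target = dir.join("src.txt");
    let link = dir.join("dst.txt");
    fs::write(&target, b"hello")?;

    let stored = create_and_verify(&target, &link)?;
    println!("{} -> {}", link.display(), stored.display());

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```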

I also ran the simulator with strace to check what the lower-level syscall was actually doing (e.g. maybe that was failing but the Rust std library was not handling the error?), but it shows a successful return code:

symlink("/dev/pts/5", "/home/<user>/Projects/vpanel/ttymxc4") = 0

This is despite the fact that the symlink /home/<user>/Projects/vpanel/ttymxc4 does not really exist. A simple visual inspection with the ls command in the above directory does not show the file, the Python service complains that it cannot find the symlink to its PTY, and the simulator itself reports a panic! when it tries to clean up after itself but fails to find the symlink to remove it:

Drop for ServiceProxy { pty: OpenptyResult { master: 9, slave: 10 }, fspath: "/home/<user>/Projects/vpanel/ttymxc4", timeout: 8s }
thread 'tokio-runtime-worker' panicked at 'remove_file failed: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/proxy.rs:262:35
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
task join failed: JoinError::Panic(...)

The backtrace is not very helpful.

Reference Code

This is what the relevant code looks like in the actual simulator:

// main.rs
#[tokio::main]
async fn main() -> Result<(), Error> {
    let m = App::from(load_yaml!("cli.yaml")).get_matches();
    let ptys: Vec<&str> = m
        .values_of("ptys")
        .expect("Missing PTY paths")
        .collect();

    let proxies = ptys
        .iter()
        .map(|pty| ServiceProxy::new(*pty))
        .collect();

    // ...
}

// proxy.rs
impl ServiceProxy {
    pub fn new(symlink_path: &str) -> Self {
        let pty = openpty(None, None).expect("openpty failed");
        let cstr = unsafe { CStr::from_ptr(ttyname(pty.slave)) };
        let slave_path = String::from(cstr.to_str().expect("CStr::to_str failed"));
        
        // This call claims success, but no link ever shows up
        symlink(&slave_path, &symlink_path).expect("PTY symlink failed");

        // ...
    }
    // ...
}

impl ops::Drop for ServiceProxy {
    fn drop(&mut self) {
        eprintln!("Drop for {:?}", self);
        close(self.pty.master).expect("close master failed");
        close(self.pty.slave).expect("close slave failed");
        
        // The panic shown earlier comes from here
        remove_file(&self.fspath).expect("remove_file failed");
    }
}

Remarks

I don't think any of these should make a difference, but just in case:

  • the simulator is using tokio (async/await futures, tasks, etc);
  • the simulator is working with PTYs instead of "regular" files like the short Rust test/example;
  • a simple Python test script using os.symlink(...) works fine.

Update

I added the following code to the simulator, as a test, right after the symlink call:

if Path::new(&symlink_path).exists() {
    eprintln!("What?!: {}", symlink_path);
}
for p in std::fs::read_dir("/home/<user>/Projects/vpanel").unwrap() {
    eprintln!("{:?}", p.unwrap().path().display());
}

Interestingly, it lists the symlink as being present (irrelevant stuff omitted):

What?!: /home/<user>/Projects/vpanel/ttymxc4
...
"/home/<user>/Projects/vpanel/ttymxc4"
...

However, it's never listed by commands such as ls -la. To make sure that there weren't any unexpected remove_file calls, I checked as follows:

$ find src -name '*.rs' | xargs grep remove_file
src/proxy.rs:    fs::remove_file,
src/proxy.rs:        remove_file(&self.fspath).expect("remove_file failed");

The only hit for an actual call in the code base is from the std::ops::Drop implementation. (The top hit is from the use std::{..., fs::remove_file, ...}; block.)

In short, there're no hidden/unexpected/accidental calls to remove_file after the symlink call. There's only the one we already knew about.
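One caveat about the Path::exists check used in this update: it follows symlinks, so it only returns true while the link's target also exists. That holds here as long as the slave PTY is open, but a dangling link would report false even though the link itself is present. A minimal sketch of a check that looks at the link itself, assuming a hypothetical helper named link_present and a scratch directory invented for the demo:

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// True if a symlink is present at `path`, whether or not its target exists.
/// (Hypothetical helper for illustration.)
fn link_present(path: &Path) -> bool {
    fs::symlink_metadata(path)
        .map(|m| m.file_type().is_symlink())
        .unwrap_or(false)
}

fn main() -> io::Result<()> {
    // Scratch directory under the system temp dir; the name is an assumption.
    let dir = std::env::temp_dir().join(format!("dangling-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let link = dir.join("ttymxc4");

    // The target does not exist, so this link dangles from the start.
    symlink(dir.join("no-such-target"), &link)?;

    println!("exists(): {}", link.exists()); // follows the link
    println!("link_present(): {}", link_present(&link)); // inspects the link itself

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

With a dangling link, the first line prints false and the second prints true.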


1 comment thread

General comments (7 comments)
ghost-in-the-zsh‭ wrote over 3 years ago · edited over 3 years ago

I also tried adding this line, as a test replacing the symlink function call, but the observed result was the same as what's already documented in the update:

std::process::Command::new("ln").args(vec!["-s", slave_path.as_str(), symlink_path]).output().expect("ln command failed");
Martin Bonner‭ wrote over 3 years ago · edited over 3 years ago
Try your test program creating a symlink from "/home/ray/Projects/vpanel/ttymxc4" to "/dev/pts/5". What is the behaviour? Does your .exists test also succeed in your test program? If you wait 10 seconds, does the .exists test still succeed? (Is something destroying the symlink?) What happens if you just use ln -s to create the symlink (e.g. ln -s /dev/pts/5 /home/ray/Projects/vpanel/ttymxc4 && echo Worked)? Is the symlink somehow only existing for the duration of the process?
ghost-in-the-zsh‭ wrote over 3 years ago

@MartinBonner: I'm following a few different leads, but to quickly answer your questions: I already had a separate test program and it works normally. The .exists check succeeds immediately after the symlink call, and printing the directory contents from within the program shows the link there. Something is destroying the symlink very quickly. Evidence suggests my object's drop implementation is getting called twice, even though I never call it explicitly, which seems to make no sense.

ghost-in-the-zsh‭ wrote over 3 years ago

@MartinBonner: The behavior is the same if I use std::process::Command::new("ln").args(vec!["-s", slave_path.as_str(), symlink_path]).output().expect("ln command failed"); instead of symlink. The symlink is getting destroyed quickly during execution of the process, and we're talking about a program that doesn't have too many lines of code yet, so it's easy to audit and to run find ... | xargs grep ... to verify. I haven't been able to update this post, but if I find the answer, I'll post it.

ghost-in-the-zsh‭ wrote over 3 years ago · edited over 3 years ago

@MartinBonner: BTW, I found the double-drop thing when the debugger unexpectedly hit the same breakpoint twice in the drop implementation, and also with this line, which ended up showing 2 files in the directory, instead of 1: std::process::Command::new("mktemp").arg("drop.XXXXXXXX").output().expect("mktemp::drop failed");

ghost-in-the-zsh‭ wrote over 3 years ago · edited over 3 years ago

@MartinBonner Also, if I comment out the remove_file line, the dangling symlink can be seen in the file system after program execution has ended, suggesting that the panic! is from the expected 2nd drop, but caused by an unexpected 1st drop for some reason. (There're more weird observations, such as no eprintln! message showing up for the 1st unexpected/alleged drop, but yes for the 2nd. I'm not sure what I'm missing yet.)

ghost-in-the-zsh‭ wrote over 3 years ago · edited over 3 years ago

I've found the source of the issue. I'll post an answer when I get some time for it.
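For readers following the double-drop lead from the comments above, here is one way two drops of the same path can be reproduced and counted without a debugger or mktemp. The thread does not say what caused the extra drop in the simulator; the clone below is purely an assumption for illustration, and Guard and DROPS are hypothetical names. The sketch also logs a NotFound error instead of panicking inside drop.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times Guard::drop has run (hypothetical diagnostic).
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a type that removes a path when dropped.
// Deriving Clone is an assumption; it gives two owners of the same path.
#[derive(Clone)]
struct Guard {
    fspath: PathBuf,
}

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
        match fs::remove_file(&self.fspath) {
            Ok(()) => {}
            // A second owner already removed the path: log it, do not panic.
            Err(e) if e.kind() == io::ErrorKind::NotFound => {
                eprintln!("already removed: {}", self.fspath.display());
            }
            Err(e) => eprintln!("remove_file failed: {}", e),
        }
    }
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("guard-demo-{}", std::process::id()));
    fs::write(&path, b"")?;

    let original = Guard { fspath: path.clone() };
    let copy = original.clone();

    drop(copy); // first drop removes the file
    println!("exists after first drop: {}", path.exists());

    drop(original); // second drop hits NotFound
    println!("drops: {}", DROPS.load(Ordering::SeqCst));
    Ok(())
}
```

Whether tolerating NotFound in drop is appropriate depends on the program; it hides the symptom rather than explaining why the value was dropped twice.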